markdown_link_checker_sc 0.0.118 → 0.0.119
- package/README.md +25 -21
- package/_link_checker_sc/ignore_errors.json +137 -0
- package/index.js +108 -35
- package/package.json +3 -2
- package/src/errors.js +29 -4
- package/src/links.js +15 -4
- package/src/output_errors.js +51 -0
- package/src/process_markdown.js +34 -7
- package/src/process_markdown_reflinks.js +173 -0
- package/tests/links/linkreference/links/links.md +32 -0
- package/tests/links/linkreference/references.md +32 -0
package/README.md
CHANGED
|
@@ -12,19 +12,21 @@ Current version only does internal link checking
|
|
|
12
12
|
Usage: markdown_link_checker_sc [options]
|
|
13
13
|
|
|
14
14
|
Options:
|
|
15
|
-
-r, --root <path> Root directory of your source (i.e. root of github repo). Use -d as well to specify a folder if docs are not in the root, or to just
|
|
16
|
-
particular subfolder. Defaults to current directory. (default: "D:\\github\\hamishwillee\\markdown_link_checker_sc")
|
|
17
|
-
-d, --directory [directory] The directory to search for markdown and html files, relative to root - such as: `en` for an English subfolder. Default empty (same
|
|
18
|
-
directory) (default: "")
|
|
19
|
-
-i, --imagedir [directory] The directory to search for all image files for global orphan checking, relative to root - such as: `assets` or `en`. Default empty
|
|
20
|
-
explicitly set, and global orphan checking will not be done (default: "")
|
|
15
|
+
-r, --root <path> Root directory of your source (i.e. root of github repo). Use -d as well to specify a folder if docs are not in the root, or to just
|
|
16
|
+
run on particular subfolder. Defaults to current directory. (default: "D:\\github\\hamishwillee\\markdown_link_checker_sc")
|
|
17
|
+
-d, --directory [directory] The directory to search for markdown and html files, relative to root - such as: `en` for an English subfolder. Default empty (same
|
|
18
|
+
as -r directory) (default: "")
|
|
19
|
+
-i, --imagedir [directory] The directory to search for all image files for global orphan checking, relative to root - such as: `assets` or `en`. Default empty
|
|
20
|
+
if not explicitly set, and global orphan checking will not be done (default: "")
|
|
21
21
|
-c, --headingAnchorSlugify [value] Slugify approach for turning markdown headings into heading anchors. Currently support vuepress only and always (default: "vuepress")
|
|
22
22
|
-t, --tryMarkdownforHTML [value] Try a markdown file extension check if a link to HTML fails. (default: true)
|
|
23
|
-
-l, --log <types...>
|
|
24
|
-
-f, --files <path> JSON file with array of files to report on (default is all files). Paths are relative relative to -d by default, but -r can be used
|
|
25
|
-
different root. (default: "")
|
|
23
|
+
-l, --log <types...> Types of console logs to display logs for debugging. Types: functions, todo etc.
|
|
24
|
+
-f, --files <path> JSON file with array of files to report on (default is all files). Paths are relative relative to -d by default, but -r can be used
|
|
25
|
+
to set a different root. (default: "")
|
|
26
26
|
-s, --toc [value] full filename of TOC/Summary file in file system. If not specified, inferred from file with most links to other files
|
|
27
27
|
-u, --site_url [value] Site base url in form dev.example.com (used to catch absolute urls to local files)
|
|
28
|
+
-o, --logtofile [value] Output logs to file (default: true)
|
|
29
|
+
-p, --interactive [value] Interactively add errors to the ignore list at _link_checker_sc/ignore_errors.json (default: false)
|
|
28
30
|
-h, --help display help for command
|
|
29
31
|
```
|
|
30
32
|
|
|
@@ -37,7 +39,11 @@ Currently matches:
|
|
|
37
39
|
- ``
|
|
38
40
|
- `<a href="someurl#someanchor?someparams" title="sometitle">some text</a>`
|
|
39
41
|
- `<img src="someurl" title="sometitle" />`
|
|
40
|
-
|
|
42
|
+
- `<img src="someurl" title="sometitle" />`
|
|
43
|
+
- `[reference link text][reference name]`, where the reference is define as [reference name]: reference_url "reference title"
|
|
44
|
+
- Only supports reference name and text format - not "plain reference name" like `[reference name]`
|
|
45
|
+
- reference must be all on one line, and can have up to three whitespaces before it on line. May not have text after reference title.
|
|
46
|
+
|
|
41
47
|
|
|
42
48
|
> **Note:** It uses simple regexp. If you have a link commented out, or inside a code block that may well be captured.
|
|
43
49
|
|
|
@@ -46,25 +52,26 @@ There are heaps of link formats it does not match:
|
|
|
46
52
|
- `<http://www.whatever.com>` - doesn't support autolinks
|
|
47
53
|
- `www.fred.com` - Doesn't support auto-links external.
|
|
48
54
|
- `[](linkurl)`- Doesn't properly support a link around an image.
|
|
49
|
-
-
|
|
55
|
+
- Reference links where the reference is defined across lines.
|
|
50
56
|
|
|
51
57
|
|
|
52
58
|
Essentially lots of the other things https://github.github.com/gfm/
|
|
53
59
|
|
|
54
|
-
|
|
55
60
|
The regex that drives this is very simple.
|
|
56
61
|
|
|
57
|
-
|
|
58
62
|
There are many other alternatives, such as: https://github.com/tcort/markdown-link-check
|
|
59
63
|
You might also use a tokenziker or round trip to HTML using something like https://marked.js.org/using_advanced#inline in future as HTML is eaiser to extract links from.
|
|
60
64
|
|
|
61
65
|
This does catch a LOT of cases though, and is pretty quick.
|
|
62
66
|
|
|
63
|
-
##
|
|
67
|
+
## Also does
|
|
64
68
|
|
|
65
|
-
-
|
|
66
|
-
-
|
|
67
|
-
|
|
69
|
+
- Catches markdown files that are orphans - i.e. not linked by any file, or not linked by file which has the most links (normally the TOC file)
|
|
70
|
+
- Catches orphan images
|
|
71
|
+
- Allows you to specify that some errors are OK to ignore. These are stored in a file. See `-i` options\
|
|
72
|
+
|
|
73
|
+
|
|
74
|
+
## TODO
|
|
68
75
|
|
|
69
76
|
Anchors that are not url escaped can trip it up.
|
|
70
77
|
- You can URL escape them like this: [Airframe Reference](#underwater_robot_underwater_robot_hippocampus_uuv_%28unmanned_underwater_vehicle%29)
|
|
@@ -73,8 +80,6 @@ Anchors that are not url escaped can trip it up.
|
|
|
73
80
|
|
|
74
81
|
Anchors defined in id in a or span are caught. Need to check those in video, div are also caught and used in internal link checking.
|
|
75
82
|
|
|
76
|
-
A way to indicate that a particular error can be ignored - e.g. by page, type, maybe by line etc. Perhaps make this something that can be turned on and off.
|
|
77
|
-
|
|
78
83
|
Get images in/around the source files that are not linked - i.e. orphan images.
|
|
79
84
|
|
|
80
85
|
|
|
@@ -82,8 +87,7 @@ Get images in/around the source files that are not linked - i.e. orphan images.
|
|
|
82
87
|
# How does it work?
|
|
83
88
|
|
|
84
89
|
The way this works:
|
|
85
|
-
- Specify the directory and it will
|
|
86
|
-
h below that for all markdown/html files.
|
|
90
|
+
- Specify the directory and it will search below that for all markdown/html files.
|
|
87
91
|
- It loads each file, and:
|
|
88
92
|
- parses for markdown and html style links for both page and image links.
|
|
89
93
|
- parses headings and builds list of anchors in the page (as per vuepress) for those headings (poorly tested code)
|
|
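The README changes above describe the newly supported reference-link format: `[reference link text][reference name]`, with the reference defined on a single line, indented by at most three spaces, and with nothing after the optional title. The following is a minimal sketch of that rule written from the README text alone. The function name `findReferenceLinks` and both regular expressions are assumptions for illustration; the actual patterns in `process_markdown_reflinks.js` are not shown in this diff and may differ.

```javascript
// Hypothetical patterns derived from the README description only.
// [text][name] style links (the "plain" [name] form is not supported).
const referenceLinkPattern = /\[([^\]]+)\]\[([^\]]+)\]/g;

// [name]: url "optional title"
// Must sit on one line, with at most three leading spaces and nothing after the title.
const referenceDefinitionPattern =
  /^ {0,3}\[([^\]]+)\]:[ \t]*(\S+)(?:[ \t]+"([^"]*)")?[ \t]*$/gm;

// Returns the reference links that resolve to a definition, plus those that do not.
function findReferenceLinks(markdown) {
  const definitions = new Map();
  for (const match of markdown.matchAll(referenceDefinitionPattern)) {
    definitions.set(match[1], { url: match[2], title: match[3] ?? "" });
  }

  const links = [];
  const unresolved = [];
  for (const match of markdown.matchAll(referenceLinkPattern)) {
    const [, text, refName] = match;
    const definition = definitions.get(refName);
    if (definition) {
      links.push({ text, refName, ...definition });
    } else {
      // Roughly the situation the new ReferenceForLinkNotFound error type reports.
      unresolved.push({ text, refName });
    }
  }
  return { links, unresolved };
}
```

Because both patterns are line-oriented, a definition split across lines is simply never found, which matches the limitation the README lists under unsupported formats.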
package/_link_checker_sc/ignore_errors.json
|
@@ -0,0 +1,137 @@
|
|
|
1
|
+
[
|
|
2
|
+
{
|
|
3
|
+
"type": "InternalLinkToHTML",
|
|
4
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
5
|
+
"link": {
|
|
6
|
+
"url": "../flight_modes/altitude_fw.html",
|
|
7
|
+
"text": "Altitude"
|
|
8
|
+
},
|
|
9
|
+
"hideReason": "Gas hasdfhasfkl"
|
|
10
|
+
},
|
|
11
|
+
{
|
|
12
|
+
"type": "LinkedFileMissingAnchor",
|
|
13
|
+
"fileRelativeToRoot": "en\\config_heli\\README.md",
|
|
14
|
+
"link": {
|
|
15
|
+
"url": "../airframes/airframe_reference.md#copter_helicopter_generic_helicopter_%28tail_esc%29",
|
|
16
|
+
"text": "Generic Helicopter - with Tail ESC"
|
|
17
|
+
},
|
|
18
|
+
"hideReason": "n"
|
|
19
|
+
},
|
|
20
|
+
{
|
|
21
|
+
"type": "InternalLinkToHTML",
|
|
22
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
23
|
+
"link": {
|
|
24
|
+
"url": "../flight_modes/position_fw.html",
|
|
25
|
+
"text": "Position"
|
|
26
|
+
},
|
|
27
|
+
"hideReason": "n"
|
|
28
|
+
},
|
|
29
|
+
{
|
|
30
|
+
"type": "InternalLinkToHTML",
|
|
31
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
32
|
+
"link": {
|
|
33
|
+
"url": "../flight_modes/stabilized_fw.html",
|
|
34
|
+
"text": "Stabilized"
|
|
35
|
+
},
|
|
36
|
+
"hideReason": "nnnn"
|
|
37
|
+
},
|
|
38
|
+
{
|
|
39
|
+
"type": "InternalLinkToHTML",
|
|
40
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
41
|
+
"link": {
|
|
42
|
+
"url": "../flight_modes/acro_fw.html",
|
|
43
|
+
"text": "Acro"
|
|
44
|
+
},
|
|
45
|
+
"hideReason": "nnnnnn"
|
|
46
|
+
},
|
|
47
|
+
{
|
|
48
|
+
"type": "InternalLinkToHTML",
|
|
49
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
50
|
+
"link": {
|
|
51
|
+
"url": "../flight_modes/manual_fw.html",
|
|
52
|
+
"text": "Manual"
|
|
53
|
+
},
|
|
54
|
+
"hideReason": "nnnnnnn"
|
|
55
|
+
},
|
|
56
|
+
{
|
|
57
|
+
"type": "InternalLinkToHTML",
|
|
58
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
59
|
+
"link": {
|
|
60
|
+
"url": "../flight_modes/position_mc.html",
|
|
61
|
+
"text": "Position"
|
|
62
|
+
},
|
|
63
|
+
"hideReason": "y"
|
|
64
|
+
},
|
|
65
|
+
{
|
|
66
|
+
"type": "InternalLinkToHTML",
|
|
67
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
68
|
+
"link": {
|
|
69
|
+
"url": "../flight_modes/altitude_mc.html",
|
|
70
|
+
"text": "Altitude"
|
|
71
|
+
},
|
|
72
|
+
"hideReason": "y"
|
|
73
|
+
},
|
|
74
|
+
{
|
|
75
|
+
"type": "InternalLinkToHTML",
|
|
76
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
77
|
+
"link": {
|
|
78
|
+
"url": "../flight_modes/manual_stabilized_mc.html",
|
|
79
|
+
"text": "Manual/ Stabilized"
|
|
80
|
+
},
|
|
81
|
+
"hideReason": "y"
|
|
82
|
+
},
|
|
83
|
+
{
|
|
84
|
+
"type": "InternalLinkToHTML",
|
|
85
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
86
|
+
"link": {
|
|
87
|
+
"url": "../flight_modes/acro_mc.html",
|
|
88
|
+
"text": "Acro"
|
|
89
|
+
},
|
|
90
|
+
"hideReason": "y"
|
|
91
|
+
},
|
|
92
|
+
{
|
|
93
|
+
"type": "InternalLinkToHTML",
|
|
94
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
95
|
+
"link": {
|
|
96
|
+
"url": "../flight_modes/orbit.html",
|
|
97
|
+
"text": "Orbit"
|
|
98
|
+
},
|
|
99
|
+
"hideReason": "y"
|
|
100
|
+
},
|
|
101
|
+
{
|
|
102
|
+
"type": "InternalLinkToHTML",
|
|
103
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
104
|
+
"link": {
|
|
105
|
+
"url": "../flight_modes/takeoff.html",
|
|
106
|
+
"text": "Takeoff"
|
|
107
|
+
},
|
|
108
|
+
"hideReason": "y"
|
|
109
|
+
},
|
|
110
|
+
{
|
|
111
|
+
"type": "InternalLinkToHTML",
|
|
112
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
113
|
+
"link": {
|
|
114
|
+
"url": "../flight_modes/land.html",
|
|
115
|
+
"text": "Land"
|
|
116
|
+
},
|
|
117
|
+
"hideReason": "y"
|
|
118
|
+
},
|
|
119
|
+
{
|
|
120
|
+
"type": "InternalLinkToHTML",
|
|
121
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
122
|
+
"link": {
|
|
123
|
+
"url": "../flight_modes/hold.html",
|
|
124
|
+
"text": "Hold"
|
|
125
|
+
},
|
|
126
|
+
"hideReason": "y"
|
|
127
|
+
},
|
|
128
|
+
{
|
|
129
|
+
"type": "InternalLinkToHTML",
|
|
130
|
+
"fileRelativeToRoot": "en\\flight_modes\\README.md",
|
|
131
|
+
"link": {
|
|
132
|
+
"url": "../flight_modes/return.html",
|
|
133
|
+
"text": "Return"
|
|
134
|
+
},
|
|
135
|
+
"hideReason": "y"
|
|
136
|
+
}
|
|
137
|
+
]
|
package/index.js
CHANGED
|
@@ -64,10 +64,11 @@ program
|
|
|
64
64
|
"-u, --site_url [value]",
|
|
65
65
|
"Site base url in form dev.example.com (used to catch absolute urls to local files)"
|
|
66
66
|
)
|
|
67
|
+
.option("-o, --logtofile [value]", "Output logs to file", true)
|
|
67
68
|
.option(
|
|
68
|
-
"-
|
|
69
|
-
"
|
|
70
|
-
|
|
69
|
+
"-p, --interactive [value]",
|
|
70
|
+
"Interactively add errors to the ignore list at _link_checker_sc/ignore_errors.json",
|
|
71
|
+
false
|
|
71
72
|
)
|
|
72
73
|
|
|
73
74
|
.parse(process.argv);
|
|
@@ -83,25 +84,30 @@ sharedData.allHTMLFiles = new Set([]);
|
|
|
83
84
|
sharedData.allImageFiles = new Set([]);
|
|
84
85
|
sharedData.allOtherFiles = new Set([]);
|
|
85
86
|
|
|
86
|
-
const markdownDirectory = path.join(
|
|
87
|
+
const markdownDirectory = path.join(
|
|
88
|
+
sharedData.options.root,
|
|
89
|
+
sharedData.options.directory
|
|
90
|
+
);
|
|
87
91
|
|
|
88
92
|
// Function for loading JSON file that contains files to report on
|
|
89
93
|
async function loadJSONFileToReportOn(filePath) {
|
|
90
94
|
sharedData.options.log.includes("functions")
|
|
91
95
|
? console.log(`Function: loadJSONFileToReportOn(): filePath: ${filePath}`)
|
|
92
96
|
: null;
|
|
93
|
-
|
|
97
|
+
sharedData.options.log.includes("quick")
|
|
94
98
|
? console.log(`Function: loadJSONFileToReportOn(): filePath: ${filePath}`)
|
|
95
99
|
: null;
|
|
96
100
|
try {
|
|
97
101
|
const fileContent = await fs.promises.readFile(filePath, "utf8");
|
|
98
102
|
let filesArray = JSON.parse(fileContent);
|
|
99
103
|
// Array relative to root, so update to have full path
|
|
100
|
-
filesArray = filesArray.map((str) =>
|
|
101
|
-
|
|
104
|
+
filesArray = filesArray.map((str) =>
|
|
105
|
+
path.join(sharedData.options.root, str)
|
|
106
|
+
);
|
|
107
|
+
|
|
102
108
|
sharedData.options.log.includes("quick")
|
|
103
|
-
|
|
104
|
-
|
|
109
|
+
? console.log(`quick:filesArray: ${filesArray}`)
|
|
110
|
+
: null;
|
|
105
111
|
|
|
106
112
|
return filesArray;
|
|
107
113
|
} catch (error) {
|
|
@@ -111,7 +117,6 @@ async function loadJSONFileToReportOn(filePath) {
|
|
|
111
117
|
}
|
|
112
118
|
}
|
|
113
119
|
|
|
114
|
-
|
|
115
120
|
const replaceDelimiter = (str, underscore) =>
|
|
116
121
|
underscore ? str.replace(/\s+/g, "_") : str.replace(/\s+/g, "-");
|
|
117
122
|
|
|
@@ -122,6 +127,8 @@ const processFile = async (file) => {
|
|
|
122
127
|
try {
|
|
123
128
|
const contents = await fs.promises.readFile(file, "utf8");
|
|
124
129
|
const resultsForFile = processMarkdown(contents, file);
|
|
130
|
+
//console.log(resultsForFile);
|
|
131
|
+
|
|
125
132
|
resultsForFile["page_file"] = file;
|
|
126
133
|
|
|
127
134
|
// Call slugify slugifyVuepress() on each of the headings
|
|
@@ -160,36 +167,87 @@ const processDirectory = async (dir) => {
|
|
|
160
167
|
if (result) {
|
|
161
168
|
results.push(result);
|
|
162
169
|
}
|
|
163
|
-
}
|
|
164
|
-
|
|
165
|
-
else if (isHTML(file)) {
|
|
170
|
+
} else if (isHTML(file)) {
|
|
166
171
|
sharedData.allHTMLFiles.add(file);
|
|
167
172
|
const result = await processFile(file);
|
|
168
173
|
if (result) {
|
|
169
174
|
results.push(result);
|
|
170
175
|
}
|
|
171
|
-
}
|
|
172
|
-
|
|
173
|
-
else if (isImage(file)) {
|
|
176
|
+
} else if (isImage(file)) {
|
|
174
177
|
sharedData.allImageFiles.add(file);
|
|
175
|
-
}
|
|
176
|
-
else {
|
|
178
|
+
} else {
|
|
177
179
|
sharedData.allOtherFiles.add(file);
|
|
178
180
|
}
|
|
179
181
|
}
|
|
180
182
|
return results;
|
|
181
183
|
};
|
|
182
184
|
|
|
185
|
+
function filterIgnoreErrors(errors) {
|
|
186
|
+
// This method removes any errors that are in the ignore errors list
|
|
187
|
+
// This list is imported from the file _link_checker_sc/ignore_errors.json
|
|
188
|
+
|
|
189
|
+
// Currently it is the pages to output, as listed in the options.files to output.
|
|
190
|
+
sharedData.options.log.includes("functions")
|
|
191
|
+
? console.log(`Function: filterIgnoreErrors()`)
|
|
192
|
+
: null;
|
|
193
|
+
|
|
194
|
+
try {
|
|
195
|
+
//sharedData.IgnoreErrors = require('./_link_checker_sc/ignore_errors.json');
|
|
196
|
+
const ignoreFromFile = fs.readFileSync(
|
|
197
|
+
"./_link_checker_sc/ignore_errors.json"
|
|
198
|
+
);
|
|
199
|
+
sharedData.IgnoreErrors = JSON.parse(ignoreFromFile);
|
|
200
|
+
//console.log(sharedData.IgnoreErrors);
|
|
201
|
+
} catch (error) {
|
|
202
|
+
//console.log("probs loading");
|
|
203
|
+
//console.log(error);
|
|
204
|
+
sharedData.IgnoreErrors = [];
|
|
205
|
+
}
|
|
206
|
+
|
|
207
|
+
const filteredErrors = errors.filter((error) => {
|
|
208
|
+
|
|
209
|
+
let returnValue = true; //All items are not filtered, by default.
|
|
210
|
+
sharedData.IgnoreErrors.forEach((ignorableError) => {
|
|
211
|
+
if (
|
|
212
|
+
error.type === ignorableError.type &&
|
|
213
|
+
error.fileRelativeToRoot === ignorableError.fileRelativeToRoot
|
|
214
|
+
) {
|
|
215
|
+
// Same file and type, so probably filter out.
|
|
216
|
+
if (!(error.link && ignorableError.link)) {
|
|
217
|
+
returnValue = false; // Neither have a link, so we match on same type
|
|
218
|
+
}
|
|
219
|
+
|
|
220
|
+
if (
|
|
221
|
+
error.link &&
|
|
222
|
+
ignorableError.link &&
|
|
223
|
+
error.link.url === ignorableError.link.url
|
|
224
|
+
) {
|
|
225
|
+
returnValue = false; // They both have a link and it is the same link
|
|
226
|
+
}
|
|
227
|
+
}
|
|
228
|
+
|
|
229
|
+
|
|
230
|
+
});
|
|
231
|
+
//if (returnValue ==false) console.log(error);
|
|
232
|
+
return returnValue;
|
|
233
|
+
});
|
|
234
|
+
|
|
235
|
+
|
|
236
|
+
return filteredErrors;
|
|
237
|
+
}
|
|
238
|
+
|
|
183
239
|
function filterErrors(errors) {
|
|
240
|
+
// This method filters all errors against settings in the command line
|
|
241
|
+
// Currently it is the pages to output, as listed in the options.files to output.
|
|
184
242
|
sharedData.options.log.includes("functions")
|
|
185
243
|
? console.log(`Function: filterErrors()`)
|
|
186
244
|
: null;
|
|
187
|
-
|
|
245
|
+
|
|
188
246
|
let filteredErrors = errors;
|
|
189
247
|
// Filter results on specified file names (if any specified)
|
|
190
248
|
//console.log(`Number pages to filter: ${sharedData.options.files.length}`);
|
|
191
249
|
if (sharedData.options.files.length > 0) {
|
|
192
|
-
|
|
250
|
+
//console.log(`USharedFileslength: ${sharedData.options.files.length}`);
|
|
193
251
|
filteredErrors = errors.filter((error) => {
|
|
194
252
|
//console.log(`UError: ${error}`);
|
|
195
253
|
//console.log(JSON.stringify(error, null, 2));
|
|
@@ -206,46 +264,61 @@ function filterErrors(errors) {
|
|
|
206
264
|
|
|
207
265
|
//main function, after options et have been set up.
|
|
208
266
|
(async () => {
|
|
209
|
-
|
|
210
|
-
|
|
267
|
+
sharedData.options.files
|
|
268
|
+
? (sharedData.options.files = await loadJSONFileToReportOn(
|
|
269
|
+
sharedData.options.files
|
|
270
|
+
))
|
|
271
|
+
: (sharedData.options.files = []);
|
|
211
272
|
|
|
212
273
|
// process containing markdown, return results which includes links, headings, id anchors
|
|
213
274
|
const results = await processDirectory(markdownDirectory);
|
|
214
275
|
|
|
215
|
-
// Process just the relative links to find errors like missing files, anchors
|
|
216
|
-
const errorsFromRelativeLinks = processRelativeLinks(results);
|
|
217
276
|
if (!results.allErrors) {
|
|
218
277
|
results.allErrors = [];
|
|
219
278
|
}
|
|
279
|
+
|
|
280
|
+
// Add errors saved with page during page parsing.
|
|
281
|
+
// Convenient to include with page earlier, but move into main errors item in results here.
|
|
282
|
+
// (we could also just have a global errors and add to that, and share it round to wherever errors are done - might have been easier).
|
|
283
|
+
const pageErrors = results.reduce((accumulator, page) => {
|
|
284
|
+
if (page.errors) {
|
|
285
|
+
accumulator.push(...page.errors);
|
|
286
|
+
}
|
|
287
|
+
return accumulator;
|
|
288
|
+
}, []);
|
|
289
|
+
|
|
290
|
+
results["allErrors"].push(...pageErrors);
|
|
291
|
+
|
|
292
|
+
// Process just the relative links to find errors like missing files, anchors
|
|
293
|
+
const errorsFromRelativeLinks = processRelativeLinks(results);
|
|
294
|
+
|
|
220
295
|
results["allErrors"].push(...errorsFromRelativeLinks);
|
|
221
296
|
|
|
222
297
|
// Process just images linked in local file system - find errors like missing images.
|
|
223
|
-
const errorsFromLocalImageLinks = await checkLocalImageLinks(
|
|
224
|
-
results
|
|
225
|
-
);
|
|
298
|
+
const errorsFromLocalImageLinks = await checkLocalImageLinks(results);
|
|
226
299
|
//console.log(errorsFromLocalImageLinks)
|
|
227
300
|
results["allErrors"].push(...errorsFromLocalImageLinks);
|
|
228
301
|
|
|
229
302
|
// Process links to current site URL - should be relative links normally.
|
|
230
|
-
const errorsFromUrlsToLocalSite = await processUrlsToLocalSource(
|
|
231
|
-
results
|
|
232
|
-
);
|
|
303
|
+
const errorsFromUrlsToLocalSite = await processUrlsToLocalSource(results);
|
|
233
304
|
//console.log(errorsFromUrlsToLocalSite)
|
|
234
305
|
results["allErrors"].push(...errorsFromUrlsToLocalSite);
|
|
235
306
|
|
|
236
307
|
// Check for page orphans - markdown files not linked anywhere and not in summary.
|
|
237
308
|
// Guesses the table of contents file if not specified in options.toc
|
|
238
|
-
sharedData.options.toc
|
|
309
|
+
sharedData.options.toc
|
|
310
|
+
? null
|
|
311
|
+
: (sharedData.options.toc = getPageWithMostLinks(results));
|
|
239
312
|
checkPageOrphans(results); // Perhaps should follow pattern of returning errors - currently updates results
|
|
240
313
|
|
|
241
|
-
const errorsGlobalImageOrphanCheck = await checkImageOrphansGlobal(
|
|
242
|
-
results
|
|
243
|
-
);
|
|
314
|
+
const errorsGlobalImageOrphanCheck = await checkImageOrphansGlobal(results);
|
|
244
315
|
results["allErrors"].push(...errorsGlobalImageOrphanCheck);
|
|
245
316
|
|
|
246
317
|
// Filter the errors based on the settings in options.
|
|
247
318
|
// At time of writing just filters on specific set of pages.
|
|
248
|
-
|
|
319
|
+
let filteredResults = filterErrors(results.allErrors);
|
|
320
|
+
// Filter out the ones we have indicated we want to ignore.
|
|
321
|
+
filteredResults = filterIgnoreErrors(filteredResults);
|
|
249
322
|
|
|
250
323
|
// Output the errors as console.logs
|
|
251
324
|
outputErrors(filteredResults);
|
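The matching rule added in `filterIgnoreErrors()` can be restated compactly: an error is dropped when an ignore entry has the same `type` and `fileRelativeToRoot`, and either the two do not both carry a link or both links have the same `url`. The sketch below expresses that rule with `Array.prototype.some`. The helper names `isIgnored` and `filterIgnored` are hypothetical, and this is an illustrative equivalent under those assumptions, not the code shipped in the package.

```javascript
// Hypothetical helper: true when any ignore entry matches the error.
function isIgnored(error, ignoreList) {
  return ignoreList.some((ignorable) => {
    const sameTypeAndFile =
      error.type === ignorable.type &&
      error.fileRelativeToRoot === ignorable.fileRelativeToRoot;
    if (!sameTypeAndFile) {
      return false;
    }
    // If the error and the entry do not both have a link, type and file are enough.
    if (!(error.link && ignorable.link)) {
      return true;
    }
    // Both have a link: the URLs must match as well.
    return error.link.url === ignorable.link.url;
  });
}

// Hypothetical helper: keeps only the errors that are not on the ignore list.
function filterIgnored(errors, ignoreList) {
  return errors.filter((error) => !isIgnored(error, ignoreList));
}
```

Note that the condition in the diff is "not both have a link", which is slightly broader than its inline comment ("Neither have a link"): an error without a link is also dropped when the matching ignore entry does have one.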
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "markdown_link_checker_sc",
|
|
3
|
-
"version": "0.0.
|
|
3
|
+
"version": "0.0.119",
|
|
4
4
|
"description": "Markdown Link Checker",
|
|
5
5
|
"main": "index.js",
|
|
6
6
|
"scripts": {
|
|
@@ -22,6 +22,7 @@
|
|
|
22
22
|
"author": "Hamish Willee",
|
|
23
23
|
"license": "MIT",
|
|
24
24
|
"dependencies": {
|
|
25
|
-
"commander": "^10.0.0"
|
|
25
|
+
"commander": "^10.0.0",
|
|
26
|
+
"prompt-sync": "^4.2.0"
|
|
26
27
|
}
|
|
27
28
|
}
|
package/src/errors.js
CHANGED
|
@@ -10,8 +10,15 @@ class LinkError {
|
|
|
10
10
|
if (link) {
|
|
11
11
|
this.link = link;
|
|
12
12
|
this.file = this.link.page;
|
|
13
|
+
this.fileRelativeToRoot = this.link.fileRelativeToRoot;
|
|
13
14
|
} else {
|
|
14
15
|
this.file = file; // i.e. infer file from link, but if link not specified then can take passed value
|
|
16
|
+
this.fileRelativeToRoot = this.file.split(sharedData.options.root)[1];
|
|
17
|
+
this.fileRelativeToRoot =
|
|
18
|
+
this.fileRelativeToRoot.startsWith("/") ||
|
|
19
|
+
this.fileRelativeToRoot.startsWith("\\")
|
|
20
|
+
? this.fileRelativeToRoot.substring(1)
|
|
21
|
+
: this.fileRelativeToRoot;
|
|
15
22
|
}
|
|
16
23
|
}
|
|
17
24
|
|
|
@@ -125,15 +132,32 @@ class OrphanedImageError extends LinkError {
|
|
|
125
132
|
constructor({ file, link }) {
|
|
126
133
|
super({ file: file, link: link, type: "OrphanedImage" });
|
|
127
134
|
}
|
|
135
|
+
output() {
|
|
136
|
+
console.log(`- ${this.type}: Image not linked from docs: ${this.file}`);
|
|
137
|
+
}
|
|
138
|
+
}
|
|
139
|
+
|
|
140
|
+
class ReferenceForLinkNotFoundError extends LinkError {
|
|
141
|
+
constructor({ file, linkMatch, refMatch }) {
|
|
142
|
+
super({ file: file, type: "ReferenceForLinkNotFound" });
|
|
143
|
+
if (!linkMatch) {
|
|
144
|
+
throw new Error("ReferenceForLinkNotFoundError: linkMatch is required!");
|
|
145
|
+
} else {
|
|
146
|
+
this.linkMatch = linkMatch;
|
|
147
|
+
}
|
|
148
|
+
if (!refMatch) {
|
|
149
|
+
throw new Error("ReferenceForLinkNotFoundError: refMatch is required!");
|
|
150
|
+
} else {
|
|
151
|
+
this.refMatch = refMatch;
|
|
152
|
+
}
|
|
153
|
+
}
|
|
128
154
|
output() {
|
|
129
155
|
console.log(
|
|
130
|
-
`- ${this.type}:
|
|
156
|
+
`- ${this.type}: Matching reference ${this.refMatch} not found for link ${this.linkMatch}`
|
|
131
157
|
);
|
|
132
158
|
}
|
|
133
159
|
}
|
|
134
160
|
|
|
135
|
-
|
|
136
|
-
|
|
137
161
|
export {
|
|
138
162
|
LinkError,
|
|
139
163
|
CurrentFileMissingAnchorError,
|
|
@@ -144,5 +168,6 @@ export {
|
|
|
144
168
|
PageNotInTOCError,
|
|
145
169
|
PageNotLinkedInternallyError,
|
|
146
170
|
LocalImageNotFoundError,
|
|
147
|
-
OrphanedImageError
|
|
171
|
+
OrphanedImageError,
|
|
172
|
+
ReferenceForLinkNotFoundError,
|
|
148
173
|
};
|
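The `fileRelativeToRoot` value introduced above is derived by splitting the file path on the root option and dropping one leading `/` or `\`. As written, `split(root)[1]` would be `undefined` for a path that does not contain the root, and the following `startsWith` call would then throw. The sketch below shows one defensive way to compute the same kind of value with plain string operations. The function name `toFileRelativeToRoot` is hypothetical, and this is an assumption about a possible alternative, not the package's implementation.

```javascript
// Hypothetical helper: strips `root` from the front of `filePath`.
// Returns the path unchanged when it does not sit under the root.
function toFileRelativeToRoot(filePath, root) {
  if (!root || !filePath.startsWith(root)) {
    return filePath;
  }
  const remainder = filePath.slice(root.length);
  const rootEndsWithSeparator = /[\\\/]$/.test(root);
  const remainderStartsWithSeparator = /^[\\\/]/.test(remainder);
  // Guard against partial directory-name matches such as /repo vs /repository.
  if (!rootEndsWithSeparator && !remainderStartsWithSeparator) {
    return filePath;
  }
  // Drop leading separators (the diff drops at most one; this drops any run of them).
  return remainder.replace(/^[\\\/]+/, "");
}
```

For a Windows-style path such as `D:\docs\en\README.md` with root `D:\docs`, this yields `en\README.md`, the same shape as the `fileRelativeToRoot` values stored in `ignore_errors.json`.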
package/src/links.js
CHANGED
|
@@ -8,11 +8,12 @@ class Link {
|
|
|
8
8
|
anchor = "";
|
|
9
9
|
params = "";
|
|
10
10
|
type = "unHandledLinkType";
|
|
11
|
-
goat = "This is a 2goat";
|
|
11
|
+
//goat = "This is a 2goat";
|
|
12
12
|
isImage = false;
|
|
13
13
|
isMarkdown = false;
|
|
14
14
|
isHTML = false;
|
|
15
15
|
isRelative = false;
|
|
16
|
+
isReferenceLink = false;
|
|
16
17
|
|
|
17
18
|
//isImage = false;
|
|
18
19
|
static linkTypes;
|
|
@@ -33,7 +34,7 @@ class Link {
|
|
|
33
34
|
]);
|
|
34
35
|
}
|
|
35
36
|
|
|
36
|
-
constructor({ page, url, type, text, title }) {
|
|
37
|
+
constructor({ page, url, type, text, title, refName, refMatch }) {
|
|
37
38
|
logFunction("Link:constructor");
|
|
38
39
|
|
|
39
40
|
if (page) {
|
|
@@ -42,6 +43,10 @@ class Link {
|
|
|
42
43
|
throw new Error("Link: page argument is required.");
|
|
43
44
|
}
|
|
44
45
|
|
|
46
|
+
// Create a relative file link for comparison
|
|
47
|
+
this.fileRelativeToRoot = this.page.split(sharedData.options.root)[1];
|
|
48
|
+
this.fileRelativeToRoot = (this.fileRelativeToRoot.startsWith('/') || this.fileRelativeToRoot.startsWith('\\')) ? this.fileRelativeToRoot.substring(1) : this.fileRelativeToRoot
|
|
49
|
+
|
|
45
50
|
if (url) {
|
|
46
51
|
this.url = url;
|
|
47
52
|
this.splitURL(this.url);
|
|
@@ -49,6 +54,11 @@ class Link {
|
|
|
49
54
|
throw new Error("Link: url argument is required.");
|
|
50
55
|
}
|
|
51
56
|
|
|
57
|
+
text ? (this.text = text) : (this.text = "");
|
|
58
|
+
title ? (this.title = title) : (this.title = "");
|
|
59
|
+
refName ? (this.refName = refName) : (this.refName = "");
|
|
60
|
+
refMatch ? (this.refMatch = refMatch) : (this.refMatch = "");
|
|
61
|
+
|
|
52
62
|
const linkTypeGuess = this.findType(); // Do to populate the isXxxx values
|
|
53
63
|
if (type) {
|
|
54
64
|
if (!Link.linkTypes.has(type)) {
|
|
@@ -62,8 +72,7 @@ class Link {
|
|
|
62
72
|
//No type specified - use type inferred from extension etc.
|
|
63
73
|
this.type = linkTypeGuess;
|
|
64
74
|
}
|
|
65
|
-
|
|
66
|
-
title ? (this.title = title) : (this.title = "");
|
|
75
|
+
|
|
67
76
|
}
|
|
68
77
|
|
|
69
78
|
// Take a URL and split to address, anchor, params
|
|
@@ -117,6 +126,8 @@ class Link {
|
|
|
117
126
|
this.isMarkdown =
|
|
118
127
|
this.address && isMarkdown(this.address) ? true : false; //only if address is true.
|
|
119
128
|
this.isHTML = this.address && isHTML(this.address) ? true : false; //only if address is true.
|
|
129
|
+
this.isReferenceLink = this.refName ? true : false; //Only if we have a reference name
|
|
130
|
+
|
|
120
131
|
const regexpTestProtocol = /^[a-z]+:/i;
|
|
121
132
|
|
|
122
133
|
//console.log(`Linkcheck1: ${this.address} `);
|
package/src/output_errors.js
CHANGED
|
@@ -3,6 +3,12 @@
|
|
|
3
3
|
import { sharedData } from "./shared_data.js";
|
|
4
4
|
import { logFunction } from "./helpers.js";
|
|
5
5
|
|
|
6
|
+
import promptSync from "prompt-sync";
|
|
7
|
+
const prompt = promptSync();
|
|
8
|
+
|
|
9
|
+
import fs from "fs";
|
|
10
|
+
import path from "path";
|
|
11
|
+
|
|
6
12
|
//Function that generates console and/or log output from an array of error objects.
|
|
7
13
|
// - `results` is an array of error objects.
|
|
8
14
|
// These will have a `type` and a `page`. They may also have other values, depending on type of error - such as linkurl
|
|
@@ -27,6 +33,7 @@ function outputErrors(results) {
|
|
|
27
33
|
}
|
|
28
34
|
}
|
|
29
35
|
|
|
36
|
+
//let updateErrors = false;
|
|
30
37
|
//console.log(sortedByPageErrors);
|
|
31
38
|
for (const page in sortedByPageErrors) {
|
|
32
39
|
let pageFromRoot;
|
|
@@ -40,9 +47,53 @@ function outputErrors(results) {
|
|
|
40
47
|
for (const error of sortedByPageErrors[page]) {
|
|
41
48
|
if (error.output) {
|
|
42
49
|
error.output();
|
|
50
|
+
|
|
51
|
+
// Add items to the errors to be ignored, if enabled.
|
|
52
|
+
if (sharedData.options.interactive) {
|
|
53
|
+
const hideError = prompt("Stop reporting on this error? (Y/N) ", "N");
|
|
54
|
+
console.log(`HideError: ${hideError}`);
|
|
55
|
+
if (!sharedData.IgnoreErrors) {
|
|
56
|
+
sharedData.IgnoreErrors = [];
|
|
57
|
+
}
|
|
58
|
+
if (hideError === "X" || hideError === "x") {
|
|
59
|
+
// Exit without saving
|
|
60
|
+
exit();
|
|
61
|
+
}
|
|
62
|
+
if (hideError === "Y" || hideError === "y") {
|
|
63
|
+
const reduceLink = {
|
|
64
|
+
url: error.link.url,
|
|
65
|
+
text: error.link.text,
|
|
66
|
+
};
|
|
67
|
+
const reduceError = {
|
|
68
|
+
type: error.type,
|
|
69
|
+
fileRelativeToRoot: error.fileRelativeToRoot,
|
|
70
|
+
link: reduceLink,
|
|
71
|
+
};
|
|
72
|
+
reduceError.hideReason = prompt("Why? (enter for now reason) ", "");
|
|
73
|
+
|
|
74
|
+
sharedData.IgnoreErrors.push(reduceError);
|
|
75
|
+
//updateErrors = true;
|
|
76
|
+
}
|
|
77
|
+
}
|
|
43
78
|
}
|
|
44
79
|
}
|
|
45
80
|
}
|
|
81
|
+
|
|
82
|
+
// Create the `_link_checker_sc` folder if it doesn't exist.
|
|
83
|
+
const dirPath = path.join(process.cwd(), "_link_checker_sc");
|
|
84
|
+
if (!fs.existsSync(dirPath) && sharedData.options.interactive) {
|
|
85
|
+
fs.mkdirSync(dirPath);
|
|
86
|
+
}
|
|
87
|
+
|
|
88
|
+
// Create create file to store the json for the errors into
|
|
89
|
+
// But only if iterative update in progress
|
|
90
|
+
if (sharedData.options.interactive) {
|
|
91
|
+
const filePath = path.join(dirPath, "ignore_errors.json");
|
|
92
|
+
fs.writeFileSync(
|
|
93
|
+
filePath,
|
|
94
|
+
JSON.stringify(sharedData.IgnoreErrors, null, 2)
|
|
95
|
+
);
|
|
96
|
+
}
|
|
46
97
|
}
|
|
47
98
|
|
|
48
99
|
export { outputErrors };
|
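In interactive mode, the code above reduces each accepted error to the fields stored in `ignore_errors.json`: `type`, `fileRelativeToRoot`, a trimmed `link` (`url` and `text`), and `hideReason`. It reads `error.link.url` directly, so an error created without a link would presumably fail at that point. The sketch below builds the same entry shape while tolerating a missing link. The function name `toIgnoreEntry` is hypothetical and the link guard is an assumption, not behavior present in the diff.

```javascript
// Hypothetical helper: builds one ignore-list entry from an error object.
function toIgnoreEntry(error, hideReason = "") {
  const entry = {
    type: error.type,
    fileRelativeToRoot: error.fileRelativeToRoot,
  };
  // Only copy the link fields that the ignore file stores, and only if a link exists.
  if (error.link) {
    entry.link = { url: error.link.url, text: error.link.text };
  }
  entry.hideReason = hideReason;
  return entry;
}
```

An array of such entries can then be serialized with `JSON.stringify(entries, null, 2)`, as the diff does when it writes `_link_checker_sc/ignore_errors.json`.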
package/src/process_markdown.js
CHANGED
|
@@ -1,6 +1,7 @@
|
|
|
1
1
|
import { Link } from "./links.js";
|
|
2
2
|
import { sharedData } from "./shared_data.js";
|
|
3
3
|
import { logFunction } from "./helpers.js";
|
|
4
|
+
import { processReferenceLinks } from "./process_markdown_reflinks.js";
|
|
4
5
|
|
|
5
6
|
// Returns slug for a string (markdown heading) using Vuepress algorithm.
|
|
6
7
|
// Algorithm from chatgpt - needs testing.
|
|
@@ -15,7 +16,9 @@ const processMarkdown = (contents, page) => {
|
|
|
15
16
|
const urlLocalLinks = [];
|
|
16
17
|
const urlImageLinks = [];
|
|
17
18
|
const relativeImageLinks = [];
|
|
19
|
+
//const referenceLinks = [];
|
|
18
20
|
const unHandledLinkTypes = [];
|
|
21
|
+
const errors = [];
|
|
19
22
|
let redirectTo; //Pages that contain <Redirect to="string"/> links
|
|
20
23
|
|
|
21
24
|
//console.log("SHARED_DATA");
|
|
@@ -53,8 +56,21 @@ const processMarkdown = (contents, page) => {
|
|
|
53
56
|
unHandledLinkTypes,
|
|
54
57
|
page
|
|
55
58
|
);
|
|
59
|
+
|
|
60
|
+
// This gets a reference links
|
|
56
61
|
}
|
|
57
62
|
|
|
63
|
+
|
|
64
|
+
const referenceLinkInfo = processReferenceLinks(contents, page);
|
|
65
|
+
urlLinks.push(...referenceLinkInfo.urlLinks);
|
|
66
|
+
urlLocalLinks.push(...referenceLinkInfo.urlLocalLinks);
|
|
67
|
+
urlImageLinks.push(...referenceLinkInfo.urlImageLinks);
|
|
68
|
+
relativeLinks.push(...referenceLinkInfo.relativeLinks);
|
|
69
|
+
relativeImageLinks.push(...referenceLinkInfo.relativeImageLinks);
|
|
70
|
+
errors.push(...referenceLinkInfo.errors);
|
|
71
|
+
|
|
72
|
+
//errors: errors, //TODO need to also pass referenceLinkInfo.errors
|
|
73
|
+
|
|
58
74
|
// Match html tags that have an id element
|
|
59
75
|
// (another way an anchor can be created)
|
|
60
76
|
const htmlTagsWithIdsMatches = contents.match(
|
|
@@ -86,9 +102,12 @@ const processMarkdown = (contents, page) => {
|
|
|
86
102
|
relativeImageLinks,
|
|
87
103
|
unHandledLinkTypes,
|
|
88
104
|
redirectTo,
|
|
105
|
+
errors,
|
|
89
106
|
};
|
|
90
107
|
};
|
|
91
108
|
|
|
109
|
+
|
|
110
|
+
|
|
92
111
|
// Processes line, taking arrays of different link types.
|
|
93
112
|
// Update the incoming values and return
|
|
94
113
|
// Note, assumption is all links are on one line, not split across lines.
|
|
@@ -103,7 +122,7 @@ const processLineMarkdownLinks = (
|
|
|
103
122
|
unHandledLinkTypes,
|
|
104
123
|
page
|
|
105
124
|
) => {
|
|
106
|
-
logFunction(`Function:
|
|
125
|
+
logFunction(`Function: processMarkdownLinks(): page: ${page}`);
|
|
107
126
|
|
|
108
127
|
//const regex = /(?<prefix>[!@]?)\[(?<text>[^\]]+)\]\((?<url>\S+?)(?:\s+"(?<title>[^"]+)")?\)/g;
|
|
109
128
|
// Match to Markdown link OR image
|
|
@@ -199,7 +218,9 @@ const processLineMarkdownLinks = (
|
|
|
199
218
|
}
|
|
200
219
|
default: {
|
|
201
220
|
unHandledLinkTypes.push(link);
|
|
202
|
-
sharedData.options.log.includes("todo")
|
|
221
|
+
sharedData.options.log.includes("todo")
|
|
222
|
+
? console.log(`TODO: 3Unhandled link.type: ${link.type}`)
|
|
223
|
+
: null;
|
|
203
224
|
break;
|
|
204
225
|
}
|
|
205
226
|
}
|
|
@@ -224,7 +245,8 @@ const processLineMarkdownLinks = (
|
|
|
224
245
|
let linkId = "";
|
|
225
246
|
if (attributes) {
|
|
226
247
|
const titlematch = attributes.match(regexHTMLTitle);
|
|
227
|
-
linkTitle =
|
|
248
|
+
linkTitle =
|
|
249
|
+
titlematch && titlematch.groups.title ? titlematch.groups.title : "";
|
|
228
250
|
const hrefmatch = attributes.match(regexHTMLhref);
|
|
229
251
|
linkUrl = hrefmatch && hrefmatch.groups.href ? hrefmatch.groups.href : "";
|
|
230
252
|
const idMatch = attributes.match(regexHTMLid);
|
|
@@ -250,7 +272,9 @@ const processLineMarkdownLinks = (
|
|
|
250
272
|
//const link = new Link(linkUrl, linkText, linkTitle);
|
|
251
273
|
if (!linkUrl) {
|
|
252
274
|
//We should only get here for empty links.
|
|
253
|
-
console.log(
|
|
275
|
+
console.log(
|
|
276
|
+
`WWregexHTMLmatchAtag: page: ${page}, linkUrl: ${linkUrl}, linkText: ${linkText}, linkTitle: ${linkTitle}, linkType: ${linkType}`
|
|
277
|
+
);
|
|
254
278
|
}
|
|
255
279
|
|
|
256
280
|
const link = new Link({
|
|
@@ -301,7 +325,9 @@ const processLineMarkdownLinks = (
|
|
|
301
325
|
|
|
302
326
|
default: {
|
|
303
327
|
unHandledLinkTypes.push(link);
|
|
304
|
-
sharedData.options.log.includes("todo")
|
|
328
|
+
sharedData.options.log.includes("todo")
|
|
329
|
+
? console.log(`TODO: 2Unhandled link.type: ${link.type}`)
|
|
330
|
+
: null;
|
|
305
331
|
break;
|
|
306
332
|
}
|
|
307
333
|
}
|
|
@@ -316,7 +342,6 @@ const processLineMarkdownLinks = (
|
|
|
316
342
|
const regex_htmlattr_src =
|
|
317
343
|
/src\s*[=]\s*(?<quote>['"])(?<src>.*?)(?<!\\)\k<quote>/i;
|
|
318
344
|
|
|
319
|
-
|
|
320
345
|
for (const match of line.matchAll(regexHTMLImgTotal)) {
|
|
321
346
|
//console.log(`XXXXXregexHTMLImgTotals: ${match}`)
|
|
322
347
|
const attributes = match.groups.attributes;
|
|
@@ -386,7 +411,9 @@ const processLineMarkdownLinks = (
|
|
|
386
411
|
|
|
387
412
|
default: {
|
|
388
413
|
unHandledLinkTypes.push(link);
|
|
389
|
-
sharedData.options.log.includes("todo")
|
|
414
|
+
sharedData.options.log.includes("todo")
|
|
415
|
+
? console.log(`TODO: 1Unhandled link.type: ${link.type}`)
|
|
416
|
+
: null;
|
|
390
417
|
break;
|
|
391
418
|
}
|
|
392
419
|
}
|
|
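The merge step that this diff adds to `processMarkdown` appends each array returned by `processReferenceLinks` to the page-level array of the same name. A minimal sketch of the same idea written as a loop, assuming the arrays are held on one object (the package uses separate local variables) and using a hypothetical helper name `mergeLinkInfo` that is not part of the package:

```javascript
// Hypothetical helper (not in the package): append every array in `source`
// to the array of the same name in `target`.
const LINK_KEYS = [
  "urlLinks",
  "urlLocalLinks",
  "urlImageLinks",
  "relativeLinks",
  "relativeImageLinks",
  "errors",
];

function mergeLinkInfo(target, source) {
  for (const key of LINK_KEYS) {
    // Tolerate a missing key on either side rather than throwing.
    const incoming = source[key] ?? [];
    target[key] = target[key] ?? [];
    target[key].push(...incoming);
  }
  return target;
}

// Usage with stand-in data; real entries would be Link or error objects.
const pageInfo = { urlLinks: ["a"], errors: [] };
const referenceLinkInfo = { urlLinks: ["b"], relativeLinks: ["c"], errors: [] };
mergeLinkInfo(pageInfo, referenceLinkInfo);
```

Driving the merge from one key list means a new link type only has to be added in one place, at the cost of being less explicit than the six `push` calls in the diff.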
package/src/process_markdown_reflinks.js
@@ -0,0 +1,173 @@
|
|
|
1
|
+
import { Link } from "./links.js";
|
|
2
|
+
import { logFunction } from "./helpers.js";
|
|
3
|
+
import {
|
|
4
|
+
ReferenceForLinkNotFoundError /* CurrentFileMissingAnchorError, LinkedFileMissingAnchorError, */,
|
|
5
|
+
} from "./errors.js";
|
|
6
|
+
|
|
7
|
+
//import { sharedData } from "./shared_data.js";
|
|
8
|
+
|
|
9
|
+
// Process all content in page, generating lists of links and some errors.
|
|
10
|
+
function processReferenceLinks(content, page) {
|
|
11
|
+
logFunction(`Function: processReferenceLinks(): page: ${page}`);
|
|
12
|
+
|
|
13
|
+
// Detect reference link
|
|
14
|
+
//const regex = /^\[(.+?)\]:\s+(.+?$)/;
|
|
15
|
+
// Link label format: https://github.github.com/gfm/#link-label
|
|
16
|
+
// Link reference definition: https://github.github.com/gfm/#link-reference-definition
|
|
17
|
+
// This will only catch the "all in one line format".
|
|
18
|
+
// Within that it catches reference, url and title.
|
|
19
|
+
//const regex = /^\s{0,3}\[(?<refName>.+?)\]:\s*?(?<refUrl>.+?$)/;
|
|
20
|
+
//const regex = /^\s{0,3}[\[(?<refName>.+?)\]:\s*?(?<refUrl>.+?)$/;
|
|
21
|
+
const references = {};
|
|
22
|
+
const possibleLinks = [];
|
|
23
|
+
|
|
24
|
+
const errors = [];
|
|
25
|
+
const urlLinks = [];
|
|
26
|
+
const urlLocalLinks = [];
|
|
27
|
+
const urlImageLinks = [];
|
|
28
|
+
const relativeLinks = [];
|
|
29
|
+
const relativeImageLinks = [];
|
|
30
|
+
|
|
31
|
+
const regexReference =
|
|
32
|
+
/^\s{0,3}\[(?<capRefName>.+?)\]:\s*?(?<capRefUrl>.+?)(?:[\"'](?<capRefTitle>[^\"\']+)[\"'])?(\s*$)/; //is goodish
|
|
33
|
+
//const regex = /^\s{0,3}\[(?<refName>.+?)\]:\s*?(?<refUrl>.+?)(?:[\"'](?<refTitle>[^\"\']+)[\"'])?(\s*(?<refTrailing>\S*))?$/
|
|
34
|
+
// TODO NEED to do something about trailing text as it breaks parser.
|
|
35
|
+
//Split content into lines
|
|
36
|
+
const lines = content.split(/\r?\n/);
|
|
37
|
+
|
|
38
|
+
for (let i = 0; i < lines.length; i++) {
|
|
39
|
+
const line = lines[i];
|
|
40
|
+
|
|
41
|
+
// Match on reflinks
|
|
42
|
+
const matchstring = line.match(regexReference);
|
|
43
|
+
if (matchstring) {
|
|
44
|
+
const { capRefName, capRefUrl, capRefTitle } = matchstring.groups;
|
|
45
|
+
|
|
46
|
+
// Normalize refname (lowercase, trimmed, runs of whitespace collapsed to one space)
|
|
47
|
+
// First reference used by default.
|
|
48
|
+
const refName = capRefName.trim().toLowerCase().replace(/\s+/g, " ");
|
|
49
|
+
const refTitle = capRefTitle ? capRefTitle : "";
|
|
50
|
+
const refUrl = capRefUrl.trim();
|
|
51
|
+
|
|
52
|
+
if (refName.length !== capRefName.length) {
|
|
53
|
+
console.log(`TODO warn check spaces on ref: ${capRefName}`);
|
|
54
|
+
}
|
|
55
|
+
|
|
56
|
+
const refItem = {
|
|
57
|
+
ref: refName,
|
|
58
|
+
url: refUrl,
|
|
59
|
+
title: refTitle,
|
|
60
|
+
captured: matchstring[0],
|
|
61
|
+
};
|
|
62
|
+
|
|
63
|
+
if (refName in references) {
|
|
64
|
+
console.log(`TODO: Error duplicate reference to print `);
|
|
65
|
+
} else {
|
|
66
|
+
references[refName] = refItem;
|
|
67
|
+
}
|
|
68
|
+
}
|
|
69
|
+
|
|
70
|
+
//Match on possible reference links.
|
|
71
|
+
// const regexWithLinkText = /(?<prefix>[!@]?)\[(?<text>[^\]]*)\][(?<reference>.*?)]/g;
|
|
72
|
+
const regexWithLinkText =
|
|
73
|
+
/(?<prefix>[!@]?)\[(?<text>.*?)\]\[(?<reference>.*?)\]/g;
|
|
74
|
+
|
|
75
|
+
const matches = line.matchAll(regexWithLinkText);
|
|
76
|
+
//console.log(`Matches: ${matches}`);
|
|
77
|
+
|
|
78
|
+
for (const match of matches) {
|
|
79
|
+
const { prefix, text, reference } = match.groups;
|
|
80
|
+
//console.log( ` Prefix: ${prefix}, Text: ${text}, Reference: ${reference}, ` );
|
|
81
|
+
const refName = reference.trim().toLowerCase().replace(/\s+/g, " ");
|
|
82
|
+
|
|
83
|
+
//Create link (possible link from ref)
|
|
84
|
+
// Note, this is just an object, not an object of type Link
|
|
85
|
+
const link = {
|
|
86
|
+
page: page,
|
|
87
|
+
text: text,
|
|
88
|
+
prefix: prefix,
|
|
89
|
+
refName: refName,
|
|
90
|
+
refMatch: reference,
|
|
91
|
+
linkMatch: match[0],
|
|
92
|
+
};
|
|
93
|
+
possibleLinks.push(link);
|
|
94
|
+
//console.log(possibleLinks);
|
|
95
|
+
}
|
|
96
|
+
}
|
|
97
|
+
//console.log(references);
|
|
98
|
+
//console.log(possibleLinks);
|
|
99
|
+
|
|
100
|
+
// Iterate through the possible links, checking for references.
|
|
101
|
+
// Create links and errors
|
|
102
|
+
possibleLinks.forEach((value) => {
|
|
103
|
+
if (value.refName in references) {
|
|
104
|
+
//console.log("Ref exists for link:");
|
|
105
|
+
//console.log(references[value.refName]);
|
|
106
|
+
|
|
107
|
+
//Create link for ref links with matching ref
|
|
108
|
+
const link = new Link({
|
|
109
|
+
page: value.page,
|
|
110
|
+
url: references[value.refName].url,
|
|
111
|
+
text: value.text,
|
|
112
|
+
title: references[value.refName].title,
|
|
113
|
+
isReference: true,
|
|
114
|
+
refName: value.refName,
|
|
115
|
+
refMatch: value.linkMatch,
|
|
116
|
+
});
|
|
117
|
+
|
|
118
|
+
// TODO Save error here if there is a mismatch in prefix - i.e. prefix ! but URL is not an image.
|
|
119
|
+
// Perhaps roll that out elsewhere too.
|
|
120
|
+
// Now lets add to correct type.
|
|
121
|
+
|
|
122
|
+
//Link works out its own type, so add to the appropriate array to return:
|
|
123
|
+
switch (link.type) {
|
|
124
|
+
case "urlLink":
|
|
125
|
+
urlLinks.push(link);
|
|
126
|
+
break;
|
|
127
|
+
case "urlLocalLink":
|
|
128
|
+
urlLocalLinks.push(link);
|
|
129
|
+
break;
|
|
130
|
+
case "urlImageLink":
|
|
131
|
+
urlImageLinks.push(link);
|
|
132
|
+
break;
|
|
133
|
+
case "relativeLink":
|
|
134
|
+
relativeLinks.push(link);
|
|
135
|
+
break;
|
|
136
|
+
case "relativeImageLink":
|
|
137
|
+
relativeImageLinks.push(link);
|
|
138
|
+
break;
|
|
139
|
+
default:
|
|
140
|
+
throw new Error(
|
|
141
|
+
`processReferenceLinks: '${link.type}' link type unknown in switch statement!`
|
|
142
|
+
);
|
|
143
|
+
break;
|
|
144
|
+
}
|
|
145
|
+
} else {
|
|
146
|
+
const error = new ReferenceForLinkNotFoundError({
|
|
147
|
+
file: value.page,
|
|
148
|
+
linkMatch: value.linkMatch,
|
|
149
|
+
refMatch: value.refMatch,
|
|
150
|
+
});
|
|
151
|
+
//TODO: It is valid to have text that has reference format.
|
|
152
|
+
// Don't push error until it can be disabled by default or disabled individually.
|
|
153
|
+
//errors.push(error);
|
|
154
|
+
}
|
|
155
|
+
});
|
|
156
|
+
|
|
157
|
+
//console.log(refLinks);
|
|
158
|
+
return {
|
|
159
|
+
errors: errors,
|
|
160
|
+
urlLinks: urlLinks,
|
|
161
|
+
urlLocalLinks: urlLocalLinks,
|
|
162
|
+
urlImageLinks: urlImageLinks,
|
|
163
|
+
relativeLinks: relativeLinks,
|
|
164
|
+
relativeImageLinks: relativeImageLinks,
|
|
165
|
+
};
|
|
166
|
+
}
|
|
167
|
+
|
|
168
|
+
export { processReferenceLinks };
|
|
169
|
+
|
|
170
|
+
/*
|
|
171
|
+
|
|
172
|
+
|
|
173
|
+
*/
|
|
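The core of `processReferenceLinks` above is a two-pass lookup: collect definitions under a normalized label, then resolve each `[text][label]` usage against them. A condensed sketch of that flow, assuming simplified patterns (titles and the `!`/`@` prefix are left out) and hypothetical names `normalizeLabel` and `resolveReferenceLinks` that do not exist in the package:

```javascript
// Normalize a label the way the diff does: trim, lowercase, collapse whitespace.
function normalizeLabel(label) {
  return label.trim().toLowerCase().replace(/\s+/g, " ");
}

// Hypothetical condensed resolver; simplified patterns, no title handling.
function resolveReferenceLinks(content) {
  const definitionPattern = /^\s{0,3}\[(?<name>.+?)\]:\s*(?<url>\S+)/;
  const usagePattern = /\[(?<text>.*?)\]\[(?<reference>.*?)\]/g;
  const definitions = new Map();
  const usages = [];

  for (const line of content.split(/\r?\n/)) {
    const definition = line.match(definitionPattern);
    if (definition) {
      const name = normalizeLabel(definition.groups.name);
      // First definition wins, matching the behaviour in the diff.
      if (!definitions.has(name)) definitions.set(name, definition.groups.url);
    }
    for (const usage of line.matchAll(usagePattern)) {
      usages.push({
        text: usage.groups.text,
        refName: normalizeLabel(usage.groups.reference),
      });
    }
  }

  // Second pass: split usages into those with and without a definition.
  const resolved = [];
  const unresolved = [];
  for (const usage of usages) {
    if (definitions.has(usage.refName)) {
      resolved.push({ ...usage, url: definitions.get(usage.refName) });
    } else {
      unresolved.push(usage);
    }
  }
  return { resolved, unresolved };
}

const sample = [
  "See [the docs][ Docs  Home ] and [a missing one][nope].",
  "",
  "[docs home]: /a/relative/path",
  "[DOCS HOME]: /ignored/duplicate",
].join("\n");
const result = resolveReferenceLinks(sample);
```

The sketch stores definitions in a `Map` rather than a plain object. With a plain object and the `in` operator, a label such as `constructor` would be reported as defined because the property is inherited from `Object.prototype`.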
package/tests/links/linkreference/links/links.md
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# Test
|
|
2
|
+
|
|
3
|
+
Run like: `node .\index.js -d tests/links/linkreference/links/`
|
|
4
|
+
|
|
5
|
+
Confirm we pick up references
|
|
6
|
+
|
|
7
|
+
- [this is the link text before references][this is the reference 1] And why not
|
|
8
|
+
- [link text before references without reference][reference does not exist] so there
|
|
9
|
+
|
|
10
|
+
Image URL tests
|
|
11
|
+
- ![image link text to non-image URL][this is the reference 1] thingy
|
|
12
|
+
- ![image link text to image URL][this ref to image url]
|
|
13
|
+
- [image url but not image link][this ref to image url]
|
|
14
|
+
|
|
15
|
+
|
|
16
|
+
Image URL tests
|
|
17
|
+
- ![image link text to non-image URL][this is the reference 1] thingy
|
|
18
|
+
- ![image link text to image URL][this ref to image url]
|
|
19
|
+
- [image url but not image link][this ref to image url]
|
|
20
|
+
- ![image link text to relative URL][rel ref to image url]
|
|
21
|
+
|
|
22
|
+
|
|
23
|
+
[this is the reference 1]: http://this.com/is/a/url/refererence
|
|
24
|
+
[this is reference 2]: http://this.com/is/a/url/refererence 'is title in singlequote'
|
|
25
|
+
|
|
26
|
+
[this ref to image url]: http://this.com/is/a/url/animage.jpg 'is title in singlequote'
|
|
27
|
+
|
|
28
|
+
|
|
29
|
+
[rel ref to image url]: ../url/arelimage.jpg 'is title in singlequote'
|
|
30
|
+
|
|
31
|
+
|
|
32
|
+
This is some text [this is the link text after reference 2][ this is reference 2] And why not
|
|
package/tests/links/linkreference/references.md
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# Test
|
|
2
|
+
|
|
3
|
+
Run like: `node .\index.js -d tests/links/linkreference/`
|
|
4
|
+
|
|
5
|
+
Confirm we pick up references
|
|
6
|
+
|
|
7
|
+
[reference 1]: http://this.com/is/a/url/refererence
|
|
8
|
+
[reference 2]: http://this.com/is/a/url/refererence withtextafternoquoteisinvalid
|
|
9
|
+
[reference 3]: http://this.com/is/a/url/refererence "is title in doublequote"
|
|
10
|
+
[reference 4]: http://this.com/is/a/url/refererence 'is title in singlequote'
|
|
11
|
+
[reference 5]: http://this.com/is/a/url/refererence 'is title in singlequote but has text after' with following text.
|
|
12
|
+
[relativepathref]: /a/relative/path
|
|
13
|
+
|
|
14
|
+
[pathref with whitespace ]: /a/path/ref/first/should/be/used
|
|
15
|
+
|
|
16
|
+
[pathref with whitespace]: /a/path/ref/second/should/not/be/used
|
|
17
|
+
|
|
18
|
+
[ pathref WITH Capitals AnD whitespace ]: /a/path/ref/second/should/not/be/used
|
|
19
|
+
|
|
20
|
+
[ onespacebefore twospace threespace fourspace WITH Capitals AnD whitespace ]: /a/path/ref/second/should/not/be/used
|
|
21
|
+
|
|
22
|
+
[ pathref with whitespace]: /a/path/ref/link/but/should/not/be/used
|
|
23
|
+
|
|
24
|
+
[reference indented two spaces]: /a/path/ref
|
|
25
|
+
|
|
26
|
+
[reference indented THREEE spaces]: /a/path/ref
|
|
27
|
+
|
|
28
|
+
[reference indented 4 spaces should be ignored]: /a/path/ref
|
|
29
|
+
|
|
30
|
+
[ref with trailing text not title is error]: /a/path/ref trailingtextnot_matched_as_title.
|
|
31
|
+
|
|
32
|
+
[ref with trailing text after title is error]: /a/path/ref 'ref title text' trailingtextnot_matched_as_title_after_title .
|
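The definitions in this test file can be run through the definition pattern from `process_markdown_reflinks.js` to see how it behaves on each case. A small sketch, assuming the pattern is copied from the diff with only the redundant quote escapes dropped; the helper name `parseDefinition` is hypothetical:

```javascript
// Definition pattern as it appears in the diff (quote escapes simplified).
const regexReference =
  /^\s{0,3}\[(?<capRefName>.+?)\]:\s*?(?<capRefUrl>.+?)(?:["'](?<capRefTitle>[^"']+)["'])?(\s*$)/;

// Hypothetical helper: returns the parsed definition, or null if the line is not one.
function parseDefinition(line) {
  const match = line.match(regexReference);
  if (!match) return null;
  const { capRefName, capRefUrl, capRefTitle } = match.groups;
  return {
    ref: capRefName.trim().toLowerCase().replace(/\s+/g, " "),
    url: capRefUrl.trim(),
    title: capRefTitle ?? "",
  };
}

const plain = parseDefinition(
  "[reference 1]: http://this.com/is/a/url/refererence"
);
const titled = parseDefinition(
  "[reference 4]: http://this.com/is/a/url/refererence 'is title in singlequote'"
);
const indented = parseDefinition(
  "    [reference indented 4 spaces should be ignored]: /a/path/ref"
);
const trailing = parseDefinition(
  "[ref with trailing text not title is error]: /a/path/ref trailingtextnot_matched_as_title."
);
```

With these inputs the four-space indented line is not matched, while `trailing.url` still contains the unquoted trailing text. That second result is the behaviour the `TODO` about trailing text in the diff refers to.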