
markdown_link_checker_sc 0.0.118 → 0.0.119

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/README.md +25 -21
package/_link_checker_sc/ignore_errors.json +137 -0
package/index.js +108 -35
package/package.json +3 -2
package/src/errors.js +29 -4
package/src/links.js +15 -4
package/src/output_errors.js +51 -0
package/src/process_markdown.js +34 -7
package/src/process_markdown_reflinks.js +173 -0
package/tests/links/linkreference/links/links.md +32 -0
package/tests/links/linkreference/references.md +32 -0

package/README.md CHANGED

@@ -12,19 +12,21 @@ Current version only does internal link checking
 Usage: markdown_link_checker_sc [options]
 Options:
-  -r, --root <path>                   Root directory of your source (i.e. root of github repo). Use -d as well to specify a folder if docs are not in the root, or to just run on
-                                      particular subfolder. Defaults to current directory. (default: "D:\\github\\hamishwillee\\markdown_link_checker_sc")
-  -d, --directory [directory]         The directory to search for markdown and html files, relative to root - such as: `en` for an English subfolder. Default empty (same as -r
-                                      directory) (default: "")
-  -i, --imagedir [directory]          The directory to search for all image files for global orphan checking, relative to root - such as: `assets` or `en`. Default empty if not
-                                      explicitly set, and global orphan checking will not be done (default: "")
+  -r, --root <path>                   Root directory of your source (i.e. root of github repo). Use -d as well to specify a folder if docs are not in the root, or to just
+                                      run on particular subfolder. Defaults to current directory. (default: "D:\\github\\hamishwillee\\markdown_link_checker_sc")
+  -d, --directory [directory]         The directory to search for markdown and html files, relative to root - such as: `en` for an English subfolder. Default empty (same
+                                      as -r directory) (default: "")
+  -i, --imagedir [directory]          The directory to search for all image files for global orphan checking, relative to root - such as: `assets` or `en`. Default empty
+                                      if not explicitly set, and global orphan checking will not be done (default: "")
   -c, --headingAnchorSlugify [value]  Slugify approach for turning markdown headings into heading anchors. Currently support vuepress only and always (default: "vuepress")
   -t, --tryMarkdownforHTML [value]    Try a markdown file extension check if a link to HTML fails. (default: true)
-  -l, --log <types...>                Export logs for debugging. Types: allerrors, filterederrors, allresults etc.
-  -f, --files <path>                  JSON file with array of files to report on (default is all files). Paths are relative relative to -d by default, but -r can be used to set a
-                                      different root. (default: "")
+  -l, --log <types...>                Types of console logs to display logs for debugging. Types: functions, todo etc.
+  -f, --files <path>                  JSON file with array of files to report on (default is all files). Paths are relative relative to -d by default, but -r can be used
+                                      to set a different root. (default: "")
   -s, --toc [value]                   full filename of TOC/Summary file in file system. If not specified, inferred from file with most links to other files
   -u, --site_url [value]              Site base url in form dev.example.com (used to catch absolute urls to local files)
+  -o, --logtofile [value]             Output logs to file (default: true)
+  -p, --interactive [value]           Interactively add errors to the ignore list at _link_checker_sc/ignore_errors.json (default: false)
   -h, --help                          display help for command
 ```
@@ -37,7 +39,11 @@ Currently matches:
 - `![Image alt](url)`
 - `<a href="someurl#someanchor?someparams" title="sometitle">some text</a>`
 - `<img src="someurl" title="sometitle" />`
+- `[reference link text][reference name]`, where the reference is defined as `[reference name]: reference_url "reference title"`
+  - Only the full `[text][reference name]` form is supported - not the "plain reference name" form `[reference name]`
+  - The reference definition must be all on one line, and may have up to three whitespace characters before it. There may not be text after the reference title.
 > **Note:** It uses simple regexp. If you have a link commented out, or inside a code block that may well be captured.
@@ -46,25 +52,26 @@ There are heaps of link formats it does not match:
 - `<http://www.whatever.com>` - doesn't support autolinks
 - `www.fred.com` - Doesn't support auto-links external.
 - `[![image title](imageurl)](linkurl)`- Doesn't properly support a link around an image.
-- `linkreference: linkurl` - Doesn't support reference links (which would be linked like `[link text][linkreference]`
+- Reference links where the reference is defined across lines.
 Essentially lots of the other things https://github.github.com/gfm/
 The regex that drives this is very simple.
 There are many other alternatives, such as: https://github.com/tcort/markdown-link-check
 You might also use a tokenizer or round trip to HTML using something like https://marked.js.org/using_advanced#inline in future, as HTML is easier to extract links from.
 This does catch a LOT of cases though, and is pretty quick.
-## TODO
+## Also does
-- Files passed in should be filtered to check if markdown and only use the markdown ones in the markdown areas.
-- Files passed in shoudl be filtered for image types.
-  All image types passed in should be checked to make sure they are not orphans.
+- Catches markdown files that are orphans - i.e. not linked by any file, or not linked by file which has the most links (normally the TOC file)
+- Catches orphan images
+- Allows you to specify that some errors are OK to ignore. These are stored in `_link_checker_sc/ignore_errors.json`. See the `-p` (`--interactive`) option.
+## TODO
 Anchors that are not url escaped can trip it up.
 - You can URL escape them like this: [Airframe Reference](#underwater_robot_underwater_robot_hippocampus_uuv_%28unmanned_underwater_vehicle%29)
@@ -73,8 +80,6 @@ Anchors that are not url escaped can trip it up.
 Anchors defined in id in a or span are caught. Need to check those in video, div are also caught and used in internal link checking.
-A way to indicate that a particular error can be ignored - e.g. by page, type, maybe by line etc. Perhaps make this something that can be turned on and off.
 Get images in/around the source files that are not linked - i.e. orphan images.
@@ -82,8 +87,7 @@ Get images in/around the source files that are not linked - i.e. orphan images.
 # How does it work?
 The way this works:
-- Specify the directory and it will searc
-h below that for all markdown/html files.
+- Specify the directory and it will search below that for all markdown/html files.
 - It loads each file, and:
   - parses for markdown and html style links for both page and image links.
   - parses headings and builds list of anchors in the page (as per vuepress) for those headings (poorly tested code)
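
The reference-link rules described in the README hunk above (one-line definitions, at most three leading spaces, nothing after the title) can be sketched as follows. This is a minimal illustration, not the package's code: `definitionRegex`, `referenceLinkRegex`, `normalizeLabel`, and `resolveReferenceLinks` are hypothetical names, and the regexes are simplified assumptions. Labels are compared case-insensitively with whitespace collapsed, and the first definition of a label wins, as in the GFM spec.

```javascript
// Illustrative sketch only: names and regexes are assumptions, not the package's code.

// One-line definition: up to three leading spaces, then [name]: url "optional title"
const definitionRegex =
  /^ {0,3}\[(?<name>[^\]]+)\]:\s*(?<url>\S+)(?:\s+["'](?<title>[^"']*)["'])?\s*$/;

// Full reference link: [text][name]
const referenceLinkRegex = /\[(?<text>[^\]]+)\]\[(?<name>[^\]]+)\]/g;

// Labels match case-insensitively, with runs of whitespace collapsed.
const normalizeLabel = (label) =>
  label.trim().toLowerCase().replace(/\s+/g, " ");

function resolveReferenceLinks(content) {
  const lines = content.split(/\r?\n/);

  // Pass 1: collect definitions; the first definition of a label wins.
  const definitions = new Map();
  for (const line of lines) {
    const match = line.match(definitionRegex);
    if (!match) continue;
    const { name, url, title } = match.groups;
    const key = normalizeLabel(name);
    if (!definitions.has(key)) {
      definitions.set(key, { url, title: title ?? "" });
    }
  }

  // Pass 2: resolve each [text][name] usage against the definitions.
  const resolved = [];
  const unresolved = [];
  for (const line of lines) {
    if (definitionRegex.test(line)) continue; // skip definition lines
    for (const match of line.matchAll(referenceLinkRegex)) {
      const { text, name } = match.groups;
      const definition = definitions.get(normalizeLabel(name));
      if (definition) {
        resolved.push({ text, name, ...definition });
      } else {
        unresolved.push({ text, name });
      }
    }
  }
  return { resolved, unresolved };
}
```

Entries in `unresolved` correspond to the situation the new `ReferenceForLinkNotFound` error type reports; how the package itself detects that case may differ.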

package/_link_checker_sc/ignore_errors.json ADDED

@@ -0,0 +1,137 @@
+[
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/altitude_fw.html",
+      "text": "Altitude"
+    },
+    "hideReason": "Gas hasdfhasfkl"
+  },
+  {
+    "type": "LinkedFileMissingAnchor",
+    "fileRelativeToRoot": "en\\config_heli\\README.md",
+    "link": {
+      "url": "../airframes/airframe_reference.md#copter_helicopter_generic_helicopter_%28tail_esc%29",
+      "text": "Generic Helicopter - with Tail ESC"
+    },
+    "hideReason": "n"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/position_fw.html",
+      "text": "Position"
+    },
+    "hideReason": "n"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/stabilized_fw.html",
+      "text": "Stabilized"
+    },
+    "hideReason": "nnnn"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/acro_fw.html",
+      "text": "Acro"
+    },
+    "hideReason": "nnnnnn"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/manual_fw.html",
+      "text": "Manual"
+    },
+    "hideReason": "nnnnnnn"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/position_mc.html",
+      "text": "Position"
+    },
+    "hideReason": "y"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/altitude_mc.html",
+      "text": "Altitude"
+    },
+    "hideReason": "y"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/manual_stabilized_mc.html",
+      "text": "Manual/ Stabilized"
+    },
+    "hideReason": "y"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/acro_mc.html",
+      "text": "Acro"
+    },
+    "hideReason": "y"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/orbit.html",
+      "text": "Orbit"
+    },
+    "hideReason": "y"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/takeoff.html",
+      "text": "Takeoff"
+    },
+    "hideReason": "y"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/land.html",
+      "text": "Land"
+    },
+    "hideReason": "y"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/hold.html",
+      "text": "Hold"
+    },
+    "hideReason": "y"
+  },
+  {
+    "type": "InternalLinkToHTML",
+    "fileRelativeToRoot": "en\\flight_modes\\README.md",
+    "link": {
+      "url": "../flight_modes/return.html",
+      "text": "Return"
+    },
+    "hideReason": "y"
+  }
+]

package/index.js CHANGED

@@ -64,10 +64,11 @@ program
     "-u, --site_url [value]",
     "Site base url in form dev.example.com (used to catch absolute urls to local files)"
   )
+  .option("-o, --logtofile [value]", "Output logs to file", true)
   .option(
-    "-o, --logtofile [value]",
-    "Output logs to file",
-    true
+    "-p, --interactive [value]",
+    "Interactively add errors to the ignore list at _link_checker_sc/ignore_errors.json",
+    false
   )
   .parse(process.argv);
@@ -83,25 +84,30 @@ sharedData.allHTMLFiles = new Set([]);
 sharedData.allImageFiles = new Set([]);
 sharedData.allOtherFiles = new Set([]);
-const markdownDirectory = path.join(sharedData.options.root, sharedData.options.directory);
+const markdownDirectory = path.join(
+  sharedData.options.root,
+  sharedData.options.directory
+);
 // Function for loading JSON file that contains files to report on
 async function loadJSONFileToReportOn(filePath) {
   sharedData.options.log.includes("functions")
     ? console.log(`Function: loadJSONFileToReportOn(): filePath: ${filePath}`)
     : null;
-    sharedData.options.log.includes("quick")
+  sharedData.options.log.includes("quick")
     ? console.log(`Function: loadJSONFileToReportOn(): filePath: ${filePath}`)
     : null;
   try {
     const fileContent = await fs.promises.readFile(filePath, "utf8");
     let filesArray = JSON.parse(fileContent);
     // Array relative to root, so update to have full path
-    filesArray = filesArray.map((str) => path.join(sharedData.options.root, str));
+    filesArray = filesArray.map((str) =>
+      path.join(sharedData.options.root, str)
+    );
     sharedData.options.log.includes("quick")
-    ? console.log(`quick:filesArray: ${filesArray}`)
-    : null;
+      ? console.log(`quick:filesArray: ${filesArray}`)
+      : null;
     return filesArray;
   } catch (error) {
@@ -111,7 +117,6 @@ async function loadJSONFileToReportOn(filePath) {
   }
 }
 const replaceDelimiter = (str, underscore) =>
   underscore ? str.replace(/\s+/g, "_") : str.replace(/\s+/g, "-");
@@ -122,6 +127,8 @@ const processFile = async (file) => {
   try {
     const contents = await fs.promises.readFile(file, "utf8");
     const resultsForFile = processMarkdown(contents, file);
+    //console.log(resultsForFile);
     resultsForFile["page_file"] = file;
     // Call slugify slugifyVuepress() on each of the headings
@@ -160,36 +167,87 @@ const processDirectory = async (dir) => {
       if (result) {
         results.push(result);
       }
-    }
-    else if (isHTML(file)) {
+    } else if (isHTML(file)) {
       sharedData.allHTMLFiles.add(file);
       const result = await processFile(file);
       if (result) {
         results.push(result);
       }
-    }
-    else if (isImage(file)) {
+    } else if (isImage(file)) {
       sharedData.allImageFiles.add(file);
-    }
-    else {
+    } else {
       sharedData.allOtherFiles.add(file);
     }
   }
   return results;
 };
+function filterIgnoreErrors(errors) {
+  // This method removes any errors that are in the ignore errors list
+  // This list is imported from the file _link_checker_sc/ignore_errors.json
+  // Errors match on type and file, plus the link url when both sides have a link.
+  sharedData.options.log.includes("functions")
+    ? console.log(`Function: filterIgnoreErrors()`)
+    : null;
+  try {
+    //sharedData.IgnoreErrors = require('./_link_checker_sc/ignore_errors.json');
+    const ignoreFromFile = fs.readFileSync(
+      "./_link_checker_sc/ignore_errors.json"
+    );
+    sharedData.IgnoreErrors = JSON.parse(ignoreFromFile);
+    //console.log(sharedData.IgnoreErrors);
+  } catch (error) {
+    //console.log("probs loading");
+    //console.log(error);
+    sharedData.IgnoreErrors = [];
+  }
+  const filteredErrors = errors.filter((error) => {
+    let returnValue = true; //All items are not filtered, by default.
+    sharedData.IgnoreErrors.forEach((ignorableError) => {
+      if (
+        error.type === ignorableError.type &&
+        error.fileRelativeToRoot === ignorableError.fileRelativeToRoot
+      ) {
+        // Same file and type, so probably filter out.
+        if (!(error.link && ignorableError.link)) {
+          returnValue = false; // At least one side has no link, so match on type and file alone
+        }
+        if (
+          error.link &&
+          ignorableError.link &&
+          error.link.url === ignorableError.link.url
+        ) {
+          returnValue = false; // They both have a link and it is the same link
+        }
+      }
+    });
+    //if (returnValue ==false) console.log(error);
+    return returnValue;
+  });
+  return filteredErrors;
+}
 function filterErrors(errors) {
+  // This method filters all errors against settings in the command line
+  // Currently it is the pages to output, as listed in the options.files to output.
   sharedData.options.log.includes("functions")
     ? console.log(`Function: filterErrors()`)
     : null;
-  // This method filters all errors against settings in the command line - such as pages to output.
   let filteredErrors = errors;
   // Filter results on specified file names (if any specified)
   //console.log(`Number pages to filter: ${sharedData.options.files.length}`);
   if (sharedData.options.files.length > 0) {
-	  //console.log(`USharedFileslength: ${sharedData.options.files.length}`);
+    //console.log(`USharedFileslength: ${sharedData.options.files.length}`);
     filteredErrors = errors.filter((error) => {
       //console.log(`UError: ${error}`);
       //console.log(JSON.stringify(error, null, 2));
@@ -206,46 +264,61 @@ function filterErrors(errors) {
 // Main function, run after options etc. have been set up.
 (async () => {
-  sharedData.options.files ? (sharedData.options.files = await loadJSONFileToReportOn(sharedData.options.files)) : (sharedData.options.files = []);
+  sharedData.options.files
+    ? (sharedData.options.files = await loadJSONFileToReportOn(
+        sharedData.options.files
+      ))
+    : (sharedData.options.files = []);
   // Process directory containing markdown, return results which include links, headings, id anchors
   const results = await processDirectory(markdownDirectory);
-  // Process just the relative links to find errors like missing files, anchors
-  const errorsFromRelativeLinks = processRelativeLinks(results);
   if (!results.allErrors) {
     results.allErrors = [];
   }
+  // Add errors saved with page during page parsing.
+  // Convenient to include with page earlier, but move into main errors item in results here.
+  // (we could also just have a global errors and add to that, and share it round to wherever errors are done - might have been easier).
+  const pageErrors = results.reduce((accumulator, page) => {
+    if (page.errors) {
+      accumulator.push(...page.errors);
+    }
+    return accumulator;
+  }, []);
+  results["allErrors"].push(...pageErrors);
+  // Process just the relative links to find errors like missing files, anchors
+  const errorsFromRelativeLinks = processRelativeLinks(results);
   results["allErrors"].push(...errorsFromRelativeLinks);
   // Process just images linked in local file system - find errors like missing images.
-  const errorsFromLocalImageLinks = await checkLocalImageLinks(
-    results
-  );
+  const errorsFromLocalImageLinks = await checkLocalImageLinks(results);
   //console.log(errorsFromLocalImageLinks)
   results["allErrors"].push(...errorsFromLocalImageLinks);
   // Process links to current site URL - should be relative links normally.
-  const errorsFromUrlsToLocalSite = await processUrlsToLocalSource(
-    results
-  );
+  const errorsFromUrlsToLocalSite = await processUrlsToLocalSource(results);
   //console.log(errorsFromUrlsToLocalSite)
   results["allErrors"].push(...errorsFromUrlsToLocalSite);
   // Check for page orphans - markdown files not linked anywhere and not in summary.
   // Guesses the table of contents file if not specified in options.toc
-  sharedData.options.toc ? null : (sharedData.options.toc = getPageWithMostLinks(results));
+  sharedData.options.toc
+    ? null
+    : (sharedData.options.toc = getPageWithMostLinks(results));
   checkPageOrphans(results); // Perhaps should follow pattern of returning errors - currently updates results
-  const errorsGlobalImageOrphanCheck = await checkImageOrphansGlobal(
-    results
-  );
+  const errorsGlobalImageOrphanCheck = await checkImageOrphansGlobal(results);
   results["allErrors"].push(...errorsGlobalImageOrphanCheck);
   // Filter the errors based on the settings in options.
   // At time of writing just filters on specific set of pages.
-  const filteredResults = filterErrors(results.allErrors);
+  let filteredResults = filterErrors(results.allErrors);
+  // Filter out the ones we have indicated we want to ignore.
+  filteredResults = filterIgnoreErrors(filteredResults);
   // Output the errors as console.logs
   outputErrors(filteredResults);
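
The ignore-list filtering added in this hunk compares every error against every ignore entry. A minimal alternative sketch, assuming the entry shape shown in `ignore_errors.json` (`type`, `fileRelativeToRoot`, `link.url`), builds a lookup key per entry so each error is checked once. `ignoreKey` and `filterIgnored` are hypothetical names. Note that this variant is stricter than the diff: an error with no link only matches an ignore entry that also has no link url.

```javascript
// Illustrative sketch only: helper names and the key format are assumptions.

// Build a stable lookup key from the fields the ignore list stores.
const ignoreKey = ({ type, fileRelativeToRoot, link }) =>
  JSON.stringify([type, fileRelativeToRoot, link?.url ?? null]);

// Return the errors that have no matching entry in the ignore list.
function filterIgnored(errors, ignoreList) {
  const ignoredKeys = new Set(ignoreList.map(ignoreKey));
  return errors.filter((error) => !ignoredKeys.has(ignoreKey(error)));
}
```

Using a `Set` keeps the filter linear in the number of errors, which matters little for a handful of entries but avoids the nested loop as the ignore list grows.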

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "markdown_link_checker_sc",
-  "version": "0.0.118",
+  "version": "0.0.119",
   "description": "Markdown Link Checker",
   "main": "index.js",
   "scripts": {
@@ -22,6 +22,7 @@
   "author": "Hamish Willee",
   "license": "MIT",
   "dependencies": {
-    "commander": "^10.0.0"
+    "commander": "^10.0.0",
+    "prompt-sync": "^4.2.0"
   }
 }

package/src/errors.js CHANGED

@@ -10,8 +10,15 @@ class LinkError {
     if (link) {
       this.link = link;
       this.file = this.link.page;
+      this.fileRelativeToRoot = this.link.fileRelativeToRoot;
     } else {
       this.file = file; // i.e. infer file from link, but if link not specified then can take passed value
+      this.fileRelativeToRoot = this.file.split(sharedData.options.root)[1];
+      this.fileRelativeToRoot =
+        this.fileRelativeToRoot.startsWith("/") ||
+        this.fileRelativeToRoot.startsWith("\\")
+          ? this.fileRelativeToRoot.substring(1)
+          : this.fileRelativeToRoot;
     }
   }
@@ -125,15 +132,32 @@ class OrphanedImageError extends LinkError {
   constructor({ file, link }) {
     super({ file: file, link: link, type: "OrphanedImage" });
   }
+  output() {
+    console.log(`- ${this.type}: Image not linked from docs: ${this.file}`);
+  }
+}
+class ReferenceForLinkNotFoundError extends LinkError {
+  constructor({ file, linkMatch, refMatch }) {
+    super({ file: file, type: "ReferenceForLinkNotFound" });
+    if (!linkMatch) {
+      throw new Error("ReferenceForLinkNotFoundError: linkMatch is required!");
+    } else {
+      this.linkMatch = linkMatch;
+    }
+    if (!refMatch) {
+      throw new Error("ReferenceForLinkNotFoundError: refMatch is required!");
+    } else {
+      this.refMatch = refMatch;
+    }
+  }
   output() {
     console.log(
-      `- ${this.type}: Image not linked from docs: ${this.file}`
+      `- ${this.type}: Matching reference ${this.refMatch} not found for link ${this.linkMatch}`
     );
   }
 }
 export {
   LinkError,
   CurrentFileMissingAnchorError,
@@ -144,5 +168,6 @@ export {
   PageNotInTOCError,
   PageNotLinkedInternallyError,
   LocalImageNotFoundError,
-  OrphanedImageError
+  OrphanedImageError,
+  ReferenceForLinkNotFoundError,
 };

package/src/links.js CHANGED

@@ -8,11 +8,12 @@ class Link {
   anchor = "";
   params = "";
   type = "unHandledLinkType";
-  goat = "This is a 2goat";
+  //goat = "This is a 2goat";
   isImage = false;
   isMarkdown = false;
   isHTML = false;
   isRelative = false;
+  isReferenceLink = false;
   //isImage = false;
   static linkTypes;
@@ -33,7 +34,7 @@ class Link {
     ]);
   }
-  constructor({ page, url, type, text, title }) {
+  constructor({ page, url, type, text, title, refName, refMatch }) {
     logFunction("Link:constructor");
     if (page) {
@@ -42,6 +43,10 @@ class Link {
       throw new Error("Link: page argument is required.");
     }
+    // Create a relative file link for comparison
+    this.fileRelativeToRoot = this.page.split(sharedData.options.root)[1];
+    this.fileRelativeToRoot = (this.fileRelativeToRoot.startsWith('/') || this.fileRelativeToRoot.startsWith('\\')) ? this.fileRelativeToRoot.substring(1) : this.fileRelativeToRoot
     if (url) {
       this.url = url;
       this.splitURL(this.url);
@@ -49,6 +54,11 @@ class Link {
       throw new Error("Link: url argument is required.");
     }
+    text ? (this.text = text) : (this.text = "");
+    title ? (this.title = title) : (this.title = "");
+    refName ? (this.refName = refName) : (this.refName = "");
+    refMatch ? (this.refMatch = refMatch) : (this.refMatch = "");
     const linkTypeGuess = this.findType(); // Do to populate the isXxxx values
     if (type) {
       if (!Link.linkTypes.has(type)) {
@@ -62,8 +72,7 @@ class Link {
       //No type specified - use type inferred from extension etc.
       this.type = linkTypeGuess;
     }
-    text ? (this.text = text) : (this.text = "");
-    title ? (this.title = title) : (this.title = "");
   }
   // Take a URL and split to address, anchor, params
@@ -117,6 +126,8 @@ class Link {
     this.isMarkdown =
       this.address && isMarkdown(this.address) ? true : false; //only if address is true.
     this.isHTML = this.address && isHTML(this.address) ? true : false; //only if address is true.
+    this.isReferenceLink = this.refName ? true : false; //Only if we have a reference name
     const regexpTestProtocol = /^[a-z]+:/i;
     //console.log(`Linkcheck1: ${this.address} `);
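
Both `errors.js` and `links.js` in this diff derive `fileRelativeToRoot` with `split(root)[1]` followed by stripping one leading separator. That approach throws if the root is not part of the path and truncates the result if the root string occurs twice. A minimal sketch of a more defensive helper follows; `toRootRelative` is a hypothetical name, not part of the package, and it treats both `/` and `\` as separators.

```javascript
// Illustrative sketch only: toRootRelative is a hypothetical helper.

// Strip `root` from the front of `filePath`, accepting / or \ as separators.
// Paths that are not under `root` are returned unchanged.
function toRootRelative(filePath, root) {
  const trimmedRoot = root.replace(/[\\/]+$/, ""); // drop trailing separators
  const rest = filePath.slice(trimmedRoot.length);
  // Require a separator right after the root so "/repo" does not match "/repository".
  const underRoot = filePath.startsWith(trimmedRoot) && /^[\\/]/.test(rest);
  return underRoot ? rest.replace(/^[\\/]+/, "") : filePath;
}
```

In Node.js, `path.relative(root, filePath)` is the usual tool for this; the string-based version above is shown only because it has no imports and makes the separator handling explicit.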

package/src/output_errors.js CHANGED

@@ -3,6 +3,12 @@
 import { sharedData } from "./shared_data.js";
 import { logFunction } from "./helpers.js";
+import promptSync from "prompt-sync";
+const prompt = promptSync();
+import fs from "fs";
+import path from "path";
 //Function that generates console and/or log output from an array of error objects.
 // - `results` is an array of error objects.
 //  These will have a `type` and a `page`. They may also have other values, depending on type of error - such as linkurl
@@ -27,6 +33,7 @@ function outputErrors(results) {
     }
   }
+  //let updateErrors = false;
   //console.log(sortedByPageErrors);
   for (const page in sortedByPageErrors) {
     let pageFromRoot;
@@ -40,9 +47,53 @@ function outputErrors(results) {
     for (const error of sortedByPageErrors[page]) {
       if (error.output) {
         error.output();
+        // Add items to the errors to be ignored, if enabled.
+        if (sharedData.options.interactive) {
+          const hideError = prompt("Stop reporting on this error? (Y/N) ", "N");
+          console.log(`HideError: ${hideError}`);
+          if (!sharedData.IgnoreErrors) {
+            sharedData.IgnoreErrors = [];
+          }
+          if (hideError === "X" || hideError === "x") {
+            // Exit without saving
+            process.exit();
+          }
+          if (hideError === "Y" || hideError === "y") {
+            const reduceLink = {
+              url: error.link?.url, // error.link is undefined for errors created from a file only
+              text: error.link?.text,
+            };
+            const reduceError = {
+              type: error.type,
+              fileRelativeToRoot: error.fileRelativeToRoot,
+              link: reduceLink,
+            };
+            reduceError.hideReason = prompt("Why? (enter for no reason) ", "");
+            sharedData.IgnoreErrors.push(reduceError);
+            //updateErrors = true;
+          }
+        }
       }
     }
   }
+  // Create the `_link_checker_sc` folder if it doesn't exist.
+  const dirPath = path.join(process.cwd(), "_link_checker_sc");
+  if (!fs.existsSync(dirPath) && sharedData.options.interactive) {
+    fs.mkdirSync(dirPath);
+  }
+  // Create the file that stores the ignored errors as JSON,
+  // but only when running in interactive mode.
+  if (sharedData.options.interactive) {
+    const filePath = path.join(dirPath, "ignore_errors.json");
+    fs.writeFileSync(
+      filePath,
+      JSON.stringify(sharedData.IgnoreErrors, null, 2)
+    );
+  }
 }
 export { outputErrors };
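
The interactive branch above reduces an error object to the fields stored in `ignore_errors.json`. A minimal sketch of that reduction as a standalone function, assuming the entry shape shown in the JSON file earlier in this diff, is given below. `toIgnoreEntry` is a hypothetical name; unlike the hunk, it omits `link` entirely when the error has none (for example an orphaned image), which is a design choice of this sketch rather than the package's behaviour.

```javascript
// Illustrative sketch only: toIgnoreEntry is a hypothetical helper.

// Reduce an error to the fields persisted in the ignore list.
function toIgnoreEntry(error, hideReason = "") {
  const entry = {
    type: error.type,
    fileRelativeToRoot: error.fileRelativeToRoot,
  };
  // Only errors raised for a specific link carry url and text.
  if (error.link) {
    entry.link = { url: error.link.url, text: error.link.text };
  }
  entry.hideReason = hideReason;
  return entry;
}
```

Keeping the reduction separate from the prompting code makes it easy to check the saved shape without a terminal session.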

package/src/process_markdown.js CHANGED

@@ -1,6 +1,7 @@
 import { Link } from "./links.js";
 import { sharedData } from "./shared_data.js";
 import { logFunction } from "./helpers.js";
+import { processReferenceLinks } from "./process_markdown_reflinks.js";
 // Returns slug for a string (markdown heading) using Vuepress algorithm.
 // Algorithm from chatgpt - needs testing.
@@ -15,7 +16,9 @@ const processMarkdown = (contents, page) => {
   const urlLocalLinks = [];
   const urlImageLinks = [];
   const relativeImageLinks = [];
+  //const referenceLinks = [];
   const unHandledLinkTypes = [];
+  const errors = [];
   let redirectTo; //Pages that contain <Redirect to="string"/> links
   //console.log("SHARED_DATA");
@@ -53,8 +56,21 @@ const processMarkdown = (contents, page) => {
         unHandledLinkTypes,
         page
       );
+      // This gets a reference links
     }
+    const referenceLinkInfo = processReferenceLinks(contents, page);
+    urlLinks.push(...referenceLinkInfo.urlLinks);
+    urlLocalLinks.push(...referenceLinkInfo.urlLocalLinks);
+    urlImageLinks.push(...referenceLinkInfo.urlImageLinks);
+    relativeLinks.push(...referenceLinkInfo.relativeLinks);
+    relativeImageLinks.push(...referenceLinkInfo.relativeImageLinks);
+    errors.push(...referenceLinkInfo.errors);
+    //errors: errors, //TODO need to also pass referenceLinkInfo.errors
     // Match html tags that have an id element
     // (another way an anchor can be created)
     const htmlTagsWithIdsMatches = contents.match(
@@ -86,9 +102,12 @@ const processMarkdown = (contents, page) => {
     relativeImageLinks,
     unHandledLinkTypes,
     redirectTo,
+    errors,
   };
 };
 // Processes line, taking arrays of different link types.
 // Update the incoming values and return
 // Note, assumption is all links are on one line, not split across lines.
@@ -103,7 +122,7 @@ const processLineMarkdownLinks = (
   unHandledLinkTypes,
   page
 ) => {
-  logFunction(`Function: processMarkdown(): page: ${page}`);
+  logFunction(`Function: processLineMarkdownLinks(): page: ${page}`);
   //const regex = /(?<prefix>[!@]?)\[(?<text>[^\]]+)\]\((?<url>\S+?)(?:\s+"(?<title>[^"]+)")?\)/g;
   // Match to Markdown link OR image
@@ -199,7 +218,9 @@ const processLineMarkdownLinks = (
       }
       default: {
         unHandledLinkTypes.push(link);
-        sharedData.options.log.includes("todo") ? console.log(`TODO: 3Unhandled link.type: ${link.type}`) : null;
+        sharedData.options.log.includes("todo")
+          ? console.log(`TODO: 3Unhandled link.type: ${link.type}`)
+          : null;
         break;
       }
     }
@@ -224,7 +245,8 @@ const processLineMarkdownLinks = (
     let linkId = "";
     if (attributes) {
       const titlematch = attributes.match(regexHTMLTitle);
-      linkTitle = titlematch && titlematch.groups.title ? titlematch.groups.title : "";
+      linkTitle =
+        titlematch && titlematch.groups.title ? titlematch.groups.title : "";
       const hrefmatch = attributes.match(regexHTMLhref);
       linkUrl = hrefmatch && hrefmatch.groups.href ? hrefmatch.groups.href : "";
       const idMatch = attributes.match(regexHTMLid);
@@ -250,7 +272,9 @@ const processLineMarkdownLinks = (
     //const link = new Link(linkUrl, linkText, linkTitle);
     if (!linkUrl) {
       //We should only get here for empty links.
-      console.log(         `WWregexHTMLmatchAtag: page: ${page}, linkUrl: ${linkUrl}, linkText: ${linkText}, linkTitle: ${linkTitle}, linkType: ${linkType}`      );
+      console.log(
+        `WWregexHTMLmatchAtag: page: ${page}, linkUrl: ${linkUrl}, linkText: ${linkText}, linkTitle: ${linkTitle}, linkType: ${linkType}`
+      );
     }
     const link = new Link({
@@ -301,7 +325,9 @@ const processLineMarkdownLinks = (
       default: {
         unHandledLinkTypes.push(link);
-        sharedData.options.log.includes("todo") ? console.log(`TODO: 2Unhandled link.type: ${link.type}`) : null;
+        sharedData.options.log.includes("todo")
+          ? console.log(`TODO: 2Unhandled link.type: ${link.type}`)
+          : null;
         break;
       }
     }
@@ -316,7 +342,6 @@ const processLineMarkdownLinks = (
   const regex_htmlattr_src =
     /src\s*[=]\s*(?<quote>['"])(?<src>.*?)(?<!\\)\k<quote>/i;
   for (const match of line.matchAll(regexHTMLImgTotal)) {
     //console.log(`XXXXXregexHTMLImgTotals: ${match}`)
     const attributes = match.groups.attributes;
@@ -386,7 +411,9 @@ const processLineMarkdownLinks = (
       default: {
         unHandledLinkTypes.push(link);
-        sharedData.options.log.includes("todo") ? console.log(`TODO: 1Unhandled link.type: ${link.type}`) : null;
+        sharedData.options.log.includes("todo")
+          ? console.log(`TODO: 1Unhandled link.type: ${link.type}`)
+          : null;
         break;
       }
     }

package/src/process_markdown_reflinks.js ADDED

@@ -0,0 +1,173 @@
+import { Link } from "./links.js";
+import { logFunction } from "./helpers.js";
+import {
+  ReferenceForLinkNotFoundError /* CurrentFileMissingAnchorError,   LinkedFileMissingAnchorError, */,
+} from "./errors.js";
+//import { sharedData } from "./shared_data.js";
+// Process all content in page, generating lists of links and some errors.
+function processReferenceLinks(content, page) {
+  logFunction(`Function: processReferenceLinks(): page: ${page}`);
+  // Detect reference link
+  //const regex = /^\[(.+?)\]:\s+(.+?$)/;
+  // Link label format: https://github.github.com/gfm/#link-label
+  // Link reference definition: https://github.github.com/gfm/#link-reference-definition
+  //   This will only catch the "all in one line format".
+  //   Within that it catches reference, url and title.
+  //const regex = /^\s{0,3}\[(?<refName>.+?)\]:\s*?(?<refUrl>.+?$)/;
+  //const regex = /^\s{0,3}[\[(?<refName>.+?)\]:\s*?(?<refUrl>.+?)$/;
+  const references = {};
+  const possibleLinks = [];
+  const errors = [];
+  const urlLinks = [];
+  const urlLocalLinks = [];
+  const urlImageLinks = [];
+  const relativeLinks = [];
+  const relativeImageLinks = [];
+  const regexReference =
+    /^\s{0,3}\[(?<capRefName>.+?)\]:\s*?(?<capRefUrl>.+?)(?:[\"'](?<capRefTitle>[^\"\']+)[\"'])?(\s*$)/; //is goodish
+  //const regex = /^\s{0,3}\[(?<refName>.+?)\]:\s*?(?<refUrl>.+?)(?:[\"'](?<refTitle>[^\"\']+)[\"'])?(\s*(?<refTrailing>\S*))?$/
+  // TODO NEED to do something about trailing text as it breaks parser.
+  //Split content into lines
+  const lines = content.split(/\r?\n/);
+  for (let i = 0; i < lines.length; i++) {
+    const line = lines[i];
+    // Match on reflinks
+    const matchstring = line.match(regexReference);
+    if (matchstring) {
+      const { capRefName, capRefUrl, capRefTitle } = matchstring.groups;
+      // Normalize refname (lowercase, trimmed, runs of whitespace collapsed to one space)
+      // First reference used by default.
+      const refName = capRefName.trim().toLowerCase().replace(/\s+/g, " ");
+      const refTitle = capRefTitle ? capRefTitle : "";
+      const refUrl = capRefUrl.trim();
+      if (refName.length !== capRefName.length) {
+        console.log(`TODO warn check spaces on ref: ${capRefName}`);
+      }
+      const refItem = {
+        ref: refName,
+        url: refUrl,
+        title: refTitle,
+        captured: matchstring[0],
+      };
+      if (refName in references) {
+        console.log(`TODO: Error duplicate reference to print `);
+      } else {
+        references[refName] = refItem;
+      }
+    }
+    //Match on possible reference links.
+    //   const regexWithLinkText =       /(?<prefix>[!@]?)\[(?<text>[^\]]*)\][(?<reference>.*?)]/g;
+    const regexWithLinkText =
+      /(?<prefix>[!@]?)\[(?<text>.*?)\]\[(?<reference>.*?)\]/g;
+    const matches = line.matchAll(regexWithLinkText);
+    //console.log(`Matches: ${matches}`);
+    for (const match of matches) {
+      const { prefix, text, reference } = match.groups;
+      //console.log(        ` Prefix: ${prefix}, Text: ${text}, Reference: ${reference}, `      );
+      const refName = reference.trim().toLowerCase().replace(/\s+/g, " ");
+      //Create link (possible link from ref)
+      // Note, this is just an object, not an object of type Link
+      const link = {
+        page: page,
+        text: text,
+        prefix: prefix,
+        refName: refName,
+        refMatch: reference,
+        linkMatch: match[0],
+      };
+      possibleLinks.push(link);
+      //console.log(possibleLinks);
+    }
+  }
+  //console.log(references);
+  //console.log(possibleLinks);
+  // Iterate through the possible links, checking for references.
+  // Create links and errors
+  possibleLinks.forEach((value) => {
+    if (value.refName in references) {
+      //console.log("Ref exists for link:");
+      //console.log(references[value.refName]);
+      //Create link for ref links with matching ref
+      const link = new Link({
+        page: value.page,
+        url: references[value.refName].url,
+        text: value.text,
+        title: references[value.refName].title,
+        isReference: true,
+        refName: value.refName,
+        refMatch: value.linkMatch,
+      });
+      // TODO Save error here if there is a mismatch in prefix - i.e. prefix ! but URL is not an image.
+      // Perhaps roll that out elsewhere too.
+      // Now lets add to correct type.
+      //Link works out its own type, so add to the appropriate array to return:
+      switch (link.type) {
+        case "urlLink":
+          urlLinks.push(link);
+          break;
+        case "urlLocalLink":
+          urlLocalLinks.push(link);
+          break;
+        case "urlImageLink":
+          urlImageLinks.push(link);
+          break;
+        case "relativeLink":
+          relativeLinks.push(link);
+          break;
+        case "relativeImageLink":
+          relativeImageLinks.push(link);
+          break;
+        default:
+          throw new Error(
+            `processReferenceLinks: '${link.type}' link type unknown in switch statement!`
+          );
+          break;
+      }
+    } else {
+      const error = new ReferenceForLinkNotFoundError({
+        file: value.page,
+        linkMatch: value.linkMatch,
+        refMatch: value.refMatch,
+      });
+      //TODO: It is valid to have text that has reference format.
+      // Don't push error until it can be disabled by default or disabled individually.
+      //errors.push(error);
+    }
+  });
+  //console.log(refLinks);
+  return {
+     errors: errors,
+     urlLinks: urlLinks,
+     urlLocalLinks: urlLocalLinks,
+     urlImageLinks: urlImageLinks,
+     relativeLinks: relativeLinks,
+     relativeImageLinks: relativeImageLinks,
+  };
+}
+export { processReferenceLinks };
+/*
+*/

package/tests/links/linkreference/links/links.md ADDED

@@ -0,0 +1,32 @@
+# Test
+Run like: `node .\index.js -d tests/links/linkreference/links/`
+Confirm we pick up references
+- [this is the link text before references][this is the reference 1] And why not
+- [link text before references without reference][reference does not exist] so there
+Image URL tests
+- ![image link text to non-image URL][this is the reference 1] thingy
+- ![image link text to image URL][this ref to image url]
+- [image url but not image link][this ref to image url]
+Image URL tests
+- ![image link text to non-image URL][this is the reference 1] thingy
+- ![image link text to image URL][this ref to image url]
+- [image url but not image link][this ref to image url]
+- ![image link text to relative URL][rel ref to image url]
+[this is the reference 1]: http://this.com/is/a/url/refererence
+[this is reference 2]: http://this.com/is/a/url/refererence  'is title in singlequote'
+[this ref to image url]: http://this.com/is/a/url/animage.jpg  'is title in singlequote'
+[rel ref to image url]: ../url/arelimage.jpg  'is title in singlequote'
+This is some text [this is the link text after reference 2][   this is reference   2] And why not
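
The link on the last line of this test file uses irregular spacing inside its reference label. A minimal sketch of how that lookup appears to work, reusing the normalization expression and the `regexWithLinkText` pattern from `process_markdown_reflinks.js`; the helper names `normalizeRefName` and `findReferenceLinks` are hypothetical and not part of the package:

```javascript
// Hypothetical helpers for illustration only. The normalization expression
// and the link regex are copied from process_markdown_reflinks.js.
function normalizeRefName(label) {
  // Trim, lowercase, and collapse runs of whitespace to a single space.
  return label.trim().toLowerCase().replace(/\s+/g, " ");
}

function findReferenceLinks(line) {
  const regexWithLinkText =
    /(?<prefix>[!@]?)\[(?<text>.*?)\]\[(?<reference>.*?)\]/g;
  const found = [];
  for (const match of line.matchAll(regexWithLinkText)) {
    const { prefix, text, reference } = match.groups;
    found.push({ prefix, text, refName: normalizeRefName(reference) });
  }
  return found;
}

// Reference table keyed by normalized label, as the package does.
const references = {
  [normalizeRefName("this is reference 2")]:
    "http://this.com/is/a/url/refererence",
};

const line =
  "This is some text [this is the link text after reference 2][   this is reference   2] And why not";
const links = findReferenceLinks(line);

for (const link of links) {
  const resolved = link.refName in references;
  console.log(`${link.refName} -> ${resolved ? references[link.refName] : "no reference"}`);
}
```

Because both the definition label and the link label pass through the same normalization, the padded label `[   this is reference   2]` still resolves to the `[this is reference 2]` definition.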

package/tests/links/linkreference/references.md ADDED

@@ -0,0 +1,32 @@
+# Test
+Run like: `node .\index.js -d tests/links/linkreference/`
+Confirm we pick up references
+[reference 1]: http://this.com/is/a/url/refererence
+[reference 2]: http://this.com/is/a/url/refererence  withtextafternoquoteisinvalid
+[reference 3]: http://this.com/is/a/url/refererence  "is title in doublequote"
+[reference 4]: http://this.com/is/a/url/refererence  'is title in singlequote'
+[reference 5]: http://this.com/is/a/url/refererence  'is title in singlequote but has text after'   with following text.
+[relativepathref]: /a/relative/path
+[pathref with whitespace   ]: /a/path/ref/first/should/be/used
+[pathref with  whitespace]: /a/path/ref/second/should/not/be/used
+[  pathref WITH Capitals AnD  whitespace  ]: /a/path/ref/second/should/not/be/used
+[ onespacebefore  twospace   threespace    fourspace WITH Capitals AnD  whitespace  ]: /a/path/ref/second/should/not/be/used
+[  pathref with whitespace]: /a/path/ref/link/but/should/not/be/used
+  [reference indented two spaces]: /a/path/ref
+   [reference indented THREEE spaces]: /a/path/ref
+    [reference indented 4 spaces should  be ignored]: /a/path/ref
+[ref with trailing text not title is error]: /a/path/ref trailingtextnot_matched_as_title.
+[ref with trailing text after title is error]: /a/path/ref 'ref title text' trailingtextnot_matched_as_title_after_title .
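
The definitions in this file exercise the single-line `regexReference` pattern added in `process_markdown_reflinks.js`. The sketch below copies that pattern and runs it against three of the lines above; the wrapper `parseReferenceDefinition` is a hypothetical name used only for illustration:

```javascript
// Pattern copied from process_markdown_reflinks.js.
const regexReference =
  /^\s{0,3}\[(?<capRefName>.+?)\]:\s*?(?<capRefUrl>.+?)(?:[\"'](?<capRefTitle>[^\"\']+)[\"'])?(\s*$)/;

// Hypothetical wrapper: returns null when the line is not a definition.
function parseReferenceDefinition(line) {
  const match = line.match(regexReference);
  if (!match) return null;
  const { capRefName, capRefUrl, capRefTitle } = match.groups;
  return {
    ref: capRefName.trim().toLowerCase().replace(/\s+/g, " "),
    url: capRefUrl.trim(),
    title: capRefTitle ? capRefTitle : "",
  };
}

// A definition with a double-quoted title.
const withTitle = parseReferenceDefinition(
  '[reference 3]: http://this.com/is/a/url/refererence  "is title in doublequote"'
);

// Four leading spaces exceed the {0,3} indent allowance, so no match.
const indentedFour = parseReferenceDefinition(
  "    [reference indented 4 spaces should  be ignored]: /a/path/ref"
);

// Unquoted trailing text is captured as part of the URL.
const trailingText = parseReferenceDefinition(
  "[reference 2]: http://this.com/is/a/url/refererence  withtextafternoquoteisinvalid"
);

console.log(withTitle, indentedFour, trailingText);
```

The third case shows the limitation noted in the source TODO: text after the URL that is not a quoted title is folded into the captured URL rather than reported as an error.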