refcheck 0.4.2__tar.gz → 0.4.4__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: refcheck
- Version: 0.4.2
+ Version: 0.4.4
  Summary: Tool for finding broken references and links in Markdown files.
  License-File: LICENSE
  Keywords: markdown,links,references,validator,cli
@@ -17,14 +17,15 @@ Requires-Dist: requests (>=2.32.3,<3.0.0)
  Project-URL: Repository, https://github.com/flumi3/markdown-refcheck
  Description-Content-Type: text/markdown
 
- # RefCheck
+ # Markdown RefCheck
 
  [![PyPI Downloads](https://static.pepy.tech/personalized-badge/refcheck?period=total&units=INTERNATIONAL_SYSTEM&left_color=GREY&right_color=ORANGE&left_text=downloads)](https://pepy.tech/projects/refcheck)
  ![Python](https://img.shields.io/badge/python-3.10+-blue.svg)
  [![License: MIT](https://img.shields.io/badge/License-MIT-silver.svg)](https://opensource.org/licenses/MIT)
  [![CI/CD](https://github.com/flumi3/markdown-refcheck/actions/workflows/ci-cd.yml/badge.svg)](https://github.com/flumi3/markdown-refcheck/actions/workflows/ci-cd.yml)
 
- RefCheck is a simple tool for finding broken references and links in Markdown files.
+ Markdown RefCheck is a simple tool that checks Markdown references to find any broken links.
+ It helps keeping your documentation free from broken section refs, missing images and files, and unavailable websites links.
 
  ```text
  usage: refcheck [OPTIONS] [PATH ...]
@@ -45,15 +46,12 @@ options:
 
  ## Features
 
- - 🔍 **Comprehensive Reference Detection** - Find and validate various reference patterns in Markdown files
+ - 🔍 **Reference Detection** - Find and validate various reference patterns in Markdown files
  - ❌ **Broken Link Highlighting** - Quickly identify broken references with clear error messages
- - 📁 **File Path Validation** - Support for both absolute and relative file paths to any file type
  - 🌐 **Remote URL Checking** - Validate external HTTP/HTTPS links (optional with `--check-remote`)
- - 🎯 **Header Reference Validation** - Verify links to specific sections within Markdown files
  - 🛠️ **User-Friendly CLI** - Simple and intuitive command-line interface
- - ⚙️ **CI/CD Ready** - Perfect for automated quality checks in your documentation workflows
  - 🎨 **Colored Output** - Clear, color-coded results for easy scanning (disable with `--no-color`)
- - 📊 **Detailed Reporting** - Summary statistics and line-by-line reference validation
+ - ⚙️ **CI/CD Ready** - Perfect for automated quality checks in your documentation workflows
  - 🚀 **Pre-commit Integration** - Available as a pre-commit hook for automated validation
 
  ## Installation
@@ -67,6 +65,18 @@ pip install refcheck
  pipx install refcheck
  ```
 
+ ## Pre-commit Integration
+
+ Add this to your `pre-commit-config.yml`:
+
+ ```yaml
+ - repo: https://github.com/flumi3/refcheck
+   rev: v0.4.4
+   hooks:
+     - id: refcheck
+       args: ["docs/", "--exclude", "docs/filetoexclude.md"]
+ ```
+
  ## Examples
 
  ```text
@@ -112,31 +122,12 @@ tests\sample_markdown.md:52: https://www.openai.com/logo.png
  ====================================================================
  ```
 
- ## Pre-commit Hook
-
- RefCheck is also available as pre-commit hook!
-
- ```yaml
- - repo: https://github.com/flumi3/refcheck
-   rev: v0.4.2
-   hooks:
-     - id: refcheck
-       args: ["docs/", "-e", "docs/filetoexclude.md"] # e.g. scan the docs/ folder and exclude a file
- ```
-
  For more advanced configuration options, see the [Integration Guide](docs/Integration-Guide.md).
 
  ## Contributing
 
- Contributions are welcome!
-
- Please see [CONTRIBUTING.md](CONTRIBUTING.md) for:
-
- - Development setup instructions
- - Commit message conventions
- - Code quality standards
- - Testing requirements
- - Pull request guidelines
+ Contributions are welcome!
+ Please see [CONTRIBUTING.md](CONTRIBUTING.md) before opening pull requests.
 
  ## Documentation
 
@@ -146,7 +137,3 @@ For more detailed information, check out the documentation:
  - [Integration Guide](docs/Integration-Guide.md) - CI/CD and workflow integration
  - [Examples](docs/Examples.md) - Real-world usage examples
 
- ## License
-
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
-
@@ -1,11 +1,12 @@
- # RefCheck
+ # Markdown RefCheck
 
  [![PyPI Downloads](https://static.pepy.tech/personalized-badge/refcheck?period=total&units=INTERNATIONAL_SYSTEM&left_color=GREY&right_color=ORANGE&left_text=downloads)](https://pepy.tech/projects/refcheck)
  ![Python](https://img.shields.io/badge/python-3.10+-blue.svg)
  [![License: MIT](https://img.shields.io/badge/License-MIT-silver.svg)](https://opensource.org/licenses/MIT)
  [![CI/CD](https://github.com/flumi3/markdown-refcheck/actions/workflows/ci-cd.yml/badge.svg)](https://github.com/flumi3/markdown-refcheck/actions/workflows/ci-cd.yml)
 
- RefCheck is a simple tool for finding broken references and links in Markdown files.
+ Markdown RefCheck is a simple tool that checks Markdown references to find any broken links.
+ It helps keeping your documentation free from broken section refs, missing images and files, and unavailable websites links.
 
  ```text
  usage: refcheck [OPTIONS] [PATH ...]
@@ -26,15 +27,12 @@ options:
 
  ## Features
 
- - 🔍 **Comprehensive Reference Detection** - Find and validate various reference patterns in Markdown files
+ - 🔍 **Reference Detection** - Find and validate various reference patterns in Markdown files
  - ❌ **Broken Link Highlighting** - Quickly identify broken references with clear error messages
- - 📁 **File Path Validation** - Support for both absolute and relative file paths to any file type
  - 🌐 **Remote URL Checking** - Validate external HTTP/HTTPS links (optional with `--check-remote`)
- - 🎯 **Header Reference Validation** - Verify links to specific sections within Markdown files
  - 🛠️ **User-Friendly CLI** - Simple and intuitive command-line interface
- - ⚙️ **CI/CD Ready** - Perfect for automated quality checks in your documentation workflows
  - 🎨 **Colored Output** - Clear, color-coded results for easy scanning (disable with `--no-color`)
- - 📊 **Detailed Reporting** - Summary statistics and line-by-line reference validation
+ - ⚙️ **CI/CD Ready** - Perfect for automated quality checks in your documentation workflows
  - 🚀 **Pre-commit Integration** - Available as a pre-commit hook for automated validation
 
  ## Installation
@@ -48,6 +46,18 @@ pip install refcheck
  pipx install refcheck
  ```
 
+ ## Pre-commit Integration
+
+ Add this to your `pre-commit-config.yml`:
+
+ ```yaml
+ - repo: https://github.com/flumi3/refcheck
+   rev: v0.4.4
+   hooks:
+     - id: refcheck
+       args: ["docs/", "--exclude", "docs/filetoexclude.md"]
+ ```
+
  ## Examples
 
  ```text
@@ -93,31 +103,12 @@ tests\sample_markdown.md:52: https://www.openai.com/logo.png
  ====================================================================
  ```
 
- ## Pre-commit Hook
-
- RefCheck is also available as pre-commit hook!
-
- ```yaml
- - repo: https://github.com/flumi3/refcheck
-   rev: v0.4.2
-   hooks:
-     - id: refcheck
-       args: ["docs/", "-e", "docs/filetoexclude.md"] # e.g. scan the docs/ folder and exclude a file
- ```
-
  For more advanced configuration options, see the [Integration Guide](docs/Integration-Guide.md).
 
  ## Contributing
 
- Contributions are welcome!
-
- Please see [CONTRIBUTING.md](CONTRIBUTING.md) for:
-
- - Development setup instructions
- - Commit message conventions
- - Code quality standards
- - Testing requirements
- - Pull request guidelines
+ Contributions are welcome!
+ Please see [CONTRIBUTING.md](CONTRIBUTING.md) before opening pull requests.
 
  ## Documentation
 
@@ -126,7 +117,3 @@ For more detailed information, check out the documentation:
  - [CLI Reference](docs/CLI-Reference.md) - Complete command-line options and usage
  - [Integration Guide](docs/Integration-Guide.md) - CI/CD and workflow integration
  - [Examples](docs/Examples.md) - Real-world usage examples
-
- ## License
-
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
@@ -1,6 +1,6 @@
  [tool.poetry]
  name = "refcheck"
- version = "0.4.2"
+ version = "0.4.4"
  description = "Tool for finding broken references and links in Markdown files."
  authors = ["Sebastian Flum <sebastian.flum.dev@gmail.com>"]
  readme = "README.md"
@@ -7,6 +7,7 @@ logger = logging.getLogger()
 
  CODE_BLOCK_PATTERN = re.compile(r"```(?P<content>[\s\S]*?)```")
  INLINE_CODE_PATTERN = re.compile(r"`(?P<content>[^`\n]+)`")
+ HTML_COMMENT_PATTERN = re.compile(r"<!--(?P<content>[\s\S]*?)-->")
 
  # Basic Markdown references
  BASIC_REFERENCE_PATTERN = re.compile(r"!*\[(?P<text>[^\]]+)\]\((?P<link>[^)]+)\)") # []() and ![]()
@@ -98,8 +99,13 @@ class MarkdownParser:
          inline_code = self._find_matches_with_line_numbers(INLINE_CODE_PATTERN, content)
          logger.info(f"Found {len(inline_code)} inline code spans.")
 
-         # Combine code blocks and inline code for filtering
-         all_code = code_blocks + inline_code
+         # Get all HTML comments, such as <!-- ... -->
+         logger.info("Extracting HTML comments ...")
+         html_comments = self._find_matches_with_line_numbers(HTML_COMMENT_PATTERN, content)
+         logger.info(f"Found {len(html_comments)} HTML comments.")
+
+         # Combine code blocks, inline code, and HTML comments for filtering
+         all_code = code_blocks + inline_code + html_comments
 
          # Get all references that look like this: [text](reference)
          logger.info("Extracting basic references ...")
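Reviewer note: a minimal sketch of what the new `HTML_COMMENT_PATTERN` picks up. The regex is copied from the hunk above; the `sample` text and the variable names are made up for illustration and are not part of the package.

```python
import re

# Pattern copied from the diff: non-greedy, and [\s\S] lets a comment span lines.
HTML_COMMENT_PATTERN = re.compile(r"<!--(?P<content>[\s\S]*?)-->")

# Hypothetical Markdown input with one multi-line and one single-line comment.
sample = (
    "See [guide](docs/guide.md).\n"
    "<!-- [old](docs/old.md)\n"
    "     spans two lines -->\n"
    "Also [api](docs/api.md). <!-- [draft](draft.md) -->\n"
)

matches = list(HTML_COMMENT_PATTERN.finditer(sample))
comments = [m.group("content").strip() for m in matches]
spans = [m.span() for m in matches]  # (start, end) offsets of each comment

for span, body in zip(spans, comments):
    print(span, repr(body))
```

Because the quantifier is non-greedy, each comment ends at its own `-->` instead of swallowing everything up to the last one in the file.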
@@ -139,35 +145,35 @@ class MarkdownParser:
      def _drop_code_references(
          self, references: list[ReferenceMatch], code_sections: list[ReferenceMatch]
      ) -> list[ReferenceMatch]:
-         """Drop references that are part of code blocks or inline code."""
-         logger.info("Dropping references that are part of code blocks or inline code ...")
+         """Drop references that are part of code blocks, inline code, or HTML comments."""
+         logger.info(
+             "Dropping references that are part of code blocks, inline code, or HTML comments ..."
+         )
 
-         # Filter out references that are inside code blocks or inline code
+         # Filter out references whose source span is contained within a code/comment section span.
+         # Position-based comparison prevents incorrectly dropping references that share the same
+         # text as a commented-out or code-block reference but appear at a different location.
          filtered_references = []
          dropped_counter = 0
 
          for ref in references:
-             is_in_code = False
-             for code_section in code_sections:
-                 logger.debug(ref.match.group(0))
-
-                 # Check if reference is within the code section content
-                 if code_section.match.lastindex and code_section.match.lastindex >= 1:
-                     content = code_section.match.group(1)
-                     logger.debug(f"Code content: {content}")
-                     if ref.match.group(0) in content:
-                         logger.info(f"Dropping reference: {ref.match.group(0)}")
-                         is_in_code = True
-                         dropped_counter += 1
-                         break
-
-             if not is_in_code:
+             is_in_filtered_section = False
+             for section in code_sections:
+                 if section.match.start(0) <= ref.match.start(0) and ref.match.end(
+                     0
+                 ) <= section.match.end(0):
+                     logger.info(f"Dropping reference: {ref.match.group(0)}")
+                     is_in_filtered_section = True
+                     dropped_counter += 1
+                     break
+
+             if not is_in_filtered_section:
                  filtered_references.append(ref)
 
          if dropped_counter > 0:
              logger.info(f"Dropped {dropped_counter} references.")
          else:
-             logger.info("No code references found.")
+             logger.info("No filtered references found.")
 
          return filtered_references
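Reviewer note: a rough sketch of why the switch from text comparison to span comparison matters. It works on plain `re.Match` objects rather than the package's `ReferenceMatch` wrapper, and the `text` sample and the `inside` helper are hypothetical names used only for this illustration.

```python
import re

# Patterns copied from the diff.
BASIC_REFERENCE_PATTERN = re.compile(r"!*\[(?P<text>[^\]]+)\]\((?P<link>[^)]+)\)")
HTML_COMMENT_PATTERN = re.compile(r"<!--(?P<content>[\s\S]*?)-->")

# The same link appears twice: once commented out, once live.
text = "<!-- [guide](guide.md) -->\nRead the [guide](guide.md) first.\n"

refs = list(BASIC_REFERENCE_PATTERN.finditer(text))
sections = list(HTML_COMMENT_PATTERN.finditer(text))


def inside(ref: re.Match, section: re.Match) -> bool:
    """True when the reference span lies fully within the section span."""
    return section.start(0) <= ref.start(0) and ref.end(0) <= section.end(0)


# 0.4.2-style check: compare the reference text against the section content.
kept_by_text = [
    r for r in refs if not any(r.group(0) in s.group("content") for s in sections)
]

# 0.4.4-style check: compare source positions.
kept_by_position = [r for r in refs if not any(inside(r, s) for s in sections)]

print(len(kept_by_text), len(kept_by_position))
```

With the text-based check both occurrences are dropped, so the live link on the second line is never validated. The position-based check drops only the occurrence that sits inside the comment.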
 
@@ -8,21 +8,21 @@ logger = logging.getLogger()
  IGNORE_FILE = ".refcheckignore"
 
  CHECK_IGNORE_DEFAULTS = [
-     ".git",
-     ".vscode",
-     ".idea",
-     "__pycache__",
-     "node_modules",
-     "venv",
-     ".venv",
-     ".pytest_cache",
+     ".git/",
+     ".vscode/",
+     ".idea/",
+     "__pycache__/",
+     "node_modules/",
+     "venv/",
+     ".venv/",
+     ".pytest_cache/",
  ]
 
 
  def load_exclusion_patterns() -> list[str]:
      """Read exclusions from the .refcheckignore file."""
      if not os.path.isfile(IGNORE_FILE):
-         logger.warning(f"Could not find {IGNORE_FILE}. Using default exclusions.")
+         logger.info(f"Could not find {IGNORE_FILE}. Using default exclusions.")
          exclusions = CHECK_IGNORE_DEFAULTS
      else:
          logger.info(f"Reading exclusions from {IGNORE_FILE}...")
@@ -33,6 +33,21 @@ def load_exclusion_patterns() -> list[str]:
      return exclusions
 
 
+ def _is_path_excluded(path: str, exclude_set: set[str]) -> bool:
+     """Check if a path should be excluded based on the exclude set.
+
+     Handles both bare names (e.g. 'node_modules') that may appear anywhere in the
+     path tree, and explicit relative/absolute paths (e.g. '../some/dir').
+     """
+     for ex in exclude_set:
+         if path == ex or path.startswith(ex + os.sep):
+             return True
+         # Bare name with no separator: match against individual path components
+         if os.sep not in ex and ex in path.split(os.sep):
+             return True
+     return False
+
+
  def get_markdown_files_from_dir(root_dir: str, exclude: list[str] | None = None) -> list[str]:
      """Traverse the directory to get all markdown files."""
      if exclude is None:
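Reviewer note: a small sketch of how the new exclusion check behaves compared with the old `startswith` test. The function body mirrors `_is_path_excluded` from the hunk above; the example paths and the `old_check` helper are hypothetical. It assumes entries in the exclude set have already been normalized (no trailing separator), since the code that builds `exclude_set` is not shown in this diff.

```python
import os


def is_path_excluded(path: str, exclude_set: set[str]) -> bool:
    """Mirror of the helper added in 0.4.4 (logic copied from the diff)."""
    for ex in exclude_set:
        if path == ex or path.startswith(ex + os.sep):
            return True
        # Bare name with no separator: match against individual path components
        if os.sep not in ex and ex in path.split(os.sep):
            return True
    return False


def old_check(path: str, exclude_set: set[str]) -> bool:
    """The 0.4.2 behaviour: plain string-prefix comparison."""
    return any(path.startswith(ex) for ex in exclude_set)


# Hypothetical exclusions: one bare name, one explicit relative path.
excludes = {"node_modules", os.path.join("docs", "drafts")}

nested = os.path.join("packages", "web", "node_modules", "lib")
drafts = os.path.join("docs", "drafts", "wip")
lookalike = os.path.join("docs", "drafts_final")
prefix_only = "node_modules_backup"

for p in (nested, drafts, lookalike, prefix_only):
    print(p, old_check(p, excludes), is_path_excluded(p, excludes))
```

The prefix test misses a `node_modules` directory nested below the root and wrongly skips siblings such as `docs/drafts_final`; the component-aware check handles both cases.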
@@ -44,7 +59,7 @@ def get_markdown_files_from_dir(root_dir: str, exclude: list[str] | None = None)
      # Walk through the directory to get all markdown files
      for subdir, _, files in os.walk(root_dir):
          subdir_norm = os.path.normpath(subdir)
-         if any(subdir_norm.startswith(exclude_item) for exclude_item in exclude_set):
+         if _is_path_excluded(subdir_norm, exclude_set):
              continue # Skip excluded directories
 
          for file in files:
@@ -76,7 +91,7 @@ def get_markdown_files_from_args(paths: list[str], exclude: list[str] | None = N
              if norm_path.endswith(".md"):
                  markdown_files.add(norm_path)
          else:
-             print(f"[!] Warning: {path} is not a valid file or directory.")
+             print(print_yellow(f"[!] Warning: {path} is not a valid file or directory."))
 
      return list(markdown_files)
 
File without changes