whitespace-format 0.0.4__py3-none-any.whl → 0.0.6__py3-none-any.whl

@@ -1,12 +1,12 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: whitespace-format
3
- Version: 0.0.4
3
+ Version: 0.0.6
4
4
  Summary: Linter and formatter for source code files and text files
5
5
  Home-page: https://github.com/DavidPal/whitespace-format
6
6
  License: MIT
7
7
  Author: David Pal
8
8
  Author-email: davidko.pal@gmail.com
9
- Requires-Python: >=3.7.2,<4.0.0
9
+ Requires-Python: >=3.8.0,<4.0.0
10
10
  Classifier: License :: OSI Approved :: MIT License
11
11
  Classifier: Programming Language :: Python
12
12
  Classifier: Programming Language :: Python :: 3
@@ -14,6 +14,8 @@ Classifier: Programming Language :: Python :: 3.8
14
14
  Classifier: Programming Language :: Python :: 3.9
15
15
  Classifier: Programming Language :: Python :: 3.10
16
16
  Classifier: Programming Language :: Python :: 3.11
17
+ Classifier: Programming Language :: Python :: 3.12
18
+ Classifier: Programming Language :: Python :: 3.13
17
19
  Project-URL: Repository, https://github.com/DavidPal/whitespace-format
18
20
  Description-Content-Type: text/markdown
19
21
 
@@ -49,7 +51,7 @@ second time (with the same parameters) has no effect.
49
51
  pip install whitespace-format
50
52
  ```
51
53
 
52
- Installation requires Python 3.7.5 or higher.
54
+ Installation requires Python 3.8.0 or higher.
53
55
 
54
56
  ## Usage
55
57
 
@@ -61,17 +63,18 @@ whitespace-format \
61
63
  --normalize-new-line-markers \
62
64
  foo.txt my_project/
63
65
  ```
64
- The command above formats `foo.txt` and all files contained `my_project/` and
65
- its subdirectories. Files that contain `.git/` or `.idea/` in their (relative)
66
- path are excluded. For example, files in `my_project/.git/` and files in
67
- `my_project/.idea/` are excluded. Likewise, files ending with `*.pyc` are
68
- excluded.
69
-
70
- If you want only know if any changes **would be** made, add `--check-only` option:
66
+ The command above formats `foo.txt` and all files contained in `my_project/`
67
+ directory and its subdirectories. Files that contain `.git/` or `.idea/` in
68
+ their (relative) path are excluded. For example, files in `my_project/.git/`
69
+ and files in `my_project/.idea/` are excluded. Likewise, files ending with
70
+ `*.pyc` are excluded.
71
+
72
+ If you want to know only if any changes **would be** made, add `--check-only`
73
+ option:
71
74
  ```shell
72
75
  whitespace-format \
73
- --exclude ".git/|.idea/|.pyc$" \
74
76
  --check-only \
77
+ --exclude ".git/|.idea/|.pyc$" \
75
78
  --new-line-marker linux \
76
79
  --normalize-new-line-markers \
77
80
  foo.txt my_project/
@@ -93,20 +96,21 @@ The regular expression is evaluated on the path of each file.
93
96
 
94
97
  * `--add-new-line-marker-at-end-of-file` -- Add missing new line marker at end of each file.
95
98
  * `--remove-new-line-marker-from-end-of-file` -- Remove all new line marker(s) from the end of each file.
96
- This option is ignored when `--add-new-line-marker-at-end-of-file` is used.
97
- Empty lines at the end of the file are removed.
99
+ This option cannot be used in combination with `--add-new-line-marker-at-end-of-file`.
100
+ Empty lines at the end of the file are removed, i.e., this option implies `--remove-trailing-empty-lines`
101
+ option.
98
102
  * `--normalize-new-line-markers` -- Make new line markers consistent in each file
99
- by replacing `\\r\\n`, `\\n`, and `\r` with a consistent new line marker.
103
+ by replacing `\r\n`, `\n`, and `\r` with a consistent new line marker.
100
104
  * `--remove-trailing-whitespace` -- Remove whitespace at the end of each line.
101
105
  * `--remove-trailing-empty-lines` -- Remove empty lines at the end of each file.
102
- * `--new-line-marker=MARKER` -- This option specifies what new line marker to use.
103
- `MARKER` must be one of the following:
106
+ * `--new-line-marker=MARKER` -- This option specifies what new line marker to use when
107
+ adding or replacing new line markers. `MARKER` must be one of the following:
104
108
  * `auto` -- Use new line marker that is the most common in each individual file.
105
109
  If no new line marker is present in the file, Linux `\n` is used.
106
110
  This is the default option.
107
- * `linux` -- Use Linux new line marker `\\n`.
108
- * `mac` -- Use Mac new line marker `\\r`.
109
- * `windows` -- Use Windows new line marker `\\r\\n`.
111
+ * `linux` -- Use Linux new line marker `\n`.
112
+ * `mac` -- Use Mac new line marker `\r`.
113
+ * `windows` -- Use Windows new line marker `\r\n`.
110
114
  * `--encoding` -- Text encoding for both reading and writing files. Default encoding is `utf-8`.
111
115
  List of supported encodings can be found at
112
116
  https://docs.python.org/3/library/codecs.html#standard-encodings
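As a rough illustration of what `--normalize-new-line-markers` together with `--new-line-marker=MARKER` describes, the sketch below replaces every `\r\n`, `\r`, and `\n` with a single chosen marker. It is not the package's implementation; `MARKERS`, `NEW_LINE_PATTERN`, and `normalize_new_line_markers` are hypothetical names used only for this example.

```python
import re

# Hypothetical lookup table mirroring the MARKER names listed above.
MARKERS = {"linux": "\n", "mac": "\r", "windows": "\r\n"}

# "\r\n" is listed first so that a Windows marker is matched as one unit
# rather than as a Mac marker followed by a Linux marker.
NEW_LINE_PATTERN = re.compile(r"\r\n|\r|\n")


def normalize_new_line_markers(text: str, marker_name: str) -> str:
    """Replaces every new line marker in `text` with the requested one."""
    marker = MARKERS[marker_name]
    # A function is used as the replacement so that no escape processing
    # is applied to the marker string.
    return NEW_LINE_PATTERN.sub(lambda _match: marker, text)


mixed = "first\r\nsecond\rthird\n"
print(repr(normalize_new_line_markers(mixed, "linux")))  # prints 'first\nsecond\nthird\n'
```

The `auto` choice is not covered here, since it requires counting the markers in each file first.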
@@ -161,7 +165,7 @@ Default value is `-1`.
161
165
 
162
166
  * `--normalize-non-standard-whitespace=MODE` -- Replace or remove
163
167
  non-standard whitespace characters (`\v` and `\f`). `MODE` must be one of the following:
164
- * `ignore` -- Leave `\v` and `f` as is. This is the default option.
168
+ * `ignore` -- Leave `\v` and `\f` as is. This is the default option.
165
169
  * `replace` -- Replace any occurrence of `\v` or `\f` with a single space.
166
170
  * `remove` -- Remove all occurrences of `\v` and `\f`
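The three `MODE` values can be pictured with a short sketch built on `str.translate`. This is an illustration under stated assumptions, not the code shipped in the package; `apply_non_standard_whitespace_mode` is a hypothetical helper name.

```python
# Translation tables for the two non-standard whitespace characters.
REPLACE_TABLE = str.maketrans({"\v": " ", "\f": " "})  # map each to one space
REMOVE_TABLE = str.maketrans("", "", "\v\f")  # delete both characters


def apply_non_standard_whitespace_mode(text: str, mode: str) -> str:
    """Applies one of the modes `ignore`, `replace`, or `remove` to `text`."""
    if mode == "ignore":
        return text
    if mode == "replace":
        return text.translate(REPLACE_TABLE)
    if mode == "remove":
        return text.translate(REMOVE_TABLE)
    raise ValueError(f"Unknown mode: {mode}")


sample = "a\vb\fc"
print(repr(apply_non_standard_whitespace_mode(sample, "replace")))  # prints 'a b c'
print(repr(apply_non_standard_whitespace_mode(sample, "remove")))  # prints 'abc'
```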
167
171
 
@@ -181,13 +185,7 @@ MIT
181
185
  brew install poetry
182
186
  ```
183
187
 
184
- 3) Create Python virtual environment with the correct Python version:
185
- ```shell
186
- make install-python
187
- make create-environment
188
- ```
189
-
190
- 4) Add the following lines to `.zshrc` or `.bash_profile` and restart the terminal:
188
+ 3) Add the following lines to `.zshrc` or `.bash_profile` and restart the terminal:
191
189
  ```shell
192
190
  # Pyenv settings
193
191
  export PYENV_ROOT="$HOME/.pyenv"
@@ -196,6 +194,12 @@ MIT
196
194
  eval "$(pyenv virtualenv-init -)"
197
195
  ```
198
196
 
197
+ 4) Create Python virtual environment with the correct Python version:
198
+ ```shell
199
+ make install-python
200
+ make create-environment
201
+ ```
202
+
199
203
  5) Install all dependencies
200
204
  ```shell
201
205
  make install-dependecies
@@ -0,0 +1,6 @@
1
+ whitespace_format.py,sha256=Z-h8ePtLgn82M8Sj0OqdcIg1FNGqqkzmW23lJx4IE7E,28946
2
+ whitespace_format-0.0.6.dist-info/LICENSE,sha256=rT6UNfWDYFQc-eo65FioDJRMAyVOndtF95wNCUhkK74,1076
3
+ whitespace_format-0.0.6.dist-info/METADATA,sha256=lz3cNbXeyuTD4sTVINHswLIUNjPM3gL98bYScrAYt0k,10399
4
+ whitespace_format-0.0.6.dist-info/WHEEL,sha256=Nq82e9rUAnEjt98J6MlVmMCZb-t9cYE2Ir1kpBmnWfs,88
5
+ whitespace_format-0.0.6.dist-info/entry_points.txt,sha256=LbXoevzUZAF5MVbI2foNC9xeDjKS_Woz7VbA1ZNF5CY,60
6
+ whitespace_format-0.0.6.dist-info/RECORD,,
@@ -1,4 +1,4 @@
1
1
  Wheel-Version: 1.0
2
- Generator: poetry-core 1.7.0
2
+ Generator: poetry-core 1.9.1
3
3
  Root-Is-Purelib: true
4
4
  Tag: py3-none-any
whitespace_format.py CHANGED
@@ -3,7 +3,8 @@
3
3
  """Formatter of whitespace in text files.
4
4
 
5
5
  Author: David Pal <davidko.pal@gmail.com>
6
- Date: 2023
6
+ Date: 2023 - 2024
7
+ License: MIT License
7
8
 
8
9
  Usage:
9
10
 
@@ -13,21 +14,37 @@ Usage:
13
14
  from __future__ import annotations
14
15
 
15
16
  import argparse
16
- import copy
17
17
  import dataclasses
18
18
  import pathlib
19
19
  import re
20
20
  import sys
21
- from typing import Callable
22
- from typing import Dict
21
+ from enum import Enum
23
22
  from typing import List
23
+ from typing import Tuple
24
24
 
25
- VERSION = "0.0.4"
25
+ VERSION = "0.0.6"
26
26
 
27
27
  # Regular expression that does NOT match any string.
28
28
  UNMATCHABLE_REGEX = "$."
29
29
 
30
- END_OF_LINE_MARKERS = {
30
+ # Whitespace characters
31
+ CARRIAGE_RETURN = "\r"
32
+ LINE_FEED = "\n"
33
+ SPACE = " "
34
+ TAB = "\t"
35
+ VERTICAL_TAB = "\v"
36
+ FORM_FEED = "\f"
37
+
38
+ WHITESPACE_CHARACTERS = {
39
+ CARRIAGE_RETURN,
40
+ LINE_FEED,
41
+ SPACE,
42
+ TAB,
43
+ VERTICAL_TAB,
44
+ FORM_FEED,
45
+ }
46
+
47
+ NEW_LINE_MARKERS = {
31
48
  "windows": "\r\n",
32
49
  "linux": "\n",
33
50
  "mac": "\r",
@@ -59,403 +76,471 @@ COLORS = {
59
76
  "WHITE": "\033[97m",
60
77
  }
61
78
 
79
+ ESCAPE_TRANSLATION_TABLE = str.maketrans(
80
+ {
81
+ CARRIAGE_RETURN: "\\r",
82
+ LINE_FEED: "\\n",
83
+ TAB: "\\t",
84
+ VERTICAL_TAB: "\\v",
85
+ FORM_FEED: "\\f",
86
+ }
87
+ )
62
88
 
63
- def color_print(message: str, parsed_arguments: argparse.Namespace):
64
- """Outputs a colored message."""
65
- if parsed_arguments.quiet:
66
- return
67
- for color, code in COLORS.items():
68
- if parsed_arguments.color:
69
- message = message.replace(f"[{color}]", code)
70
- else:
71
- message = message.replace(f"[{color}]", "")
72
- print(message)
73
89
 
90
+ class ChangeType(Enum):
91
+ """Type of change that happened to a file."""
74
92
 
75
- def die(error_code: int, message: str = ""):
76
- """Exits the script."""
77
- if message:
78
- print(message)
79
- sys.exit(error_code)
93
+ # New line marker was added to the end of the file (because it was missing).
94
+ ADDED_NEW_LINE_MARKER_TO_END_OF_FILE = 1
80
95
 
96
+ # New line marker was removed from the end of the file.
97
+ REMOVED_NEW_LINE_MARKER_FROM_END_OF_FILE = 2
81
98
 
82
- def read_file_content(file_name: str, encoding: str) -> str:
83
- """Reads content of a file."""
84
- try:
85
- with open(file_name, "r", encoding=encoding) as file:
86
- return file.read()
87
- except IOError as exception:
88
- die(2, f"Cannot read file '{file_name}': {exception}")
89
- except UnicodeError as exception:
90
- die(3, f"Cannot decode file '{file_name}': {exception}")
91
- return ""
99
+ # New line marker was replaced by another one.
100
+ REPLACED_NEW_LINE_MARKER = 3
92
101
 
102
+ # Whitespace at the end of a line was removed.
103
+ REMOVED_TRAILING_WHITESPACE = 4
93
104
 
94
- def write_file(file_name: str, file_content: str, encoding: str):
95
- """Writes data to a file."""
96
- try:
97
- with open(file_name, "w", encoding=encoding) as file:
98
- file.write(file_content)
99
- except IOError as exception:
100
- die(4, f"Cannot write to file '{file_name}': {exception}")
105
+ # Empty line(s) at the end of file were removed.
106
+ REMOVED_EMPTY_LINES = 5
101
107
 
108
+ # An empty file was replaced by a file consisting of single empty line.
109
+ REPLACED_EMPTY_FILE_WITH_ONE_LINE = 6
102
110
 
103
- @dataclasses.dataclass
104
- class Line:
105
- """Line of a text file.
111
+ # A file consisting of only whitespace was replaced by an empty file.
112
+ REPLACED_WHITESPACE_ONLY_FILE_WITH_EMPTY_FILE = 7
106
113
 
107
- The line is split into two parts:
108
- 1) Content
109
- 2) End of line marker ("\n", or "\r", or "\r\n")
110
- """
114
+ # A file consisting of only whitespace was replaced by a file consisting of single empty line.
115
+ REPLACED_WHITESPACE_ONLY_FILE_WITH_ONE_LINE = 8
111
116
 
112
- content: str
113
- end_of_line_marker: str
117
+ # A tab character was replaced by space character(s).
118
+ REPLACED_TAB_WITH_SPACES = 9
114
119
 
115
- @staticmethod
116
- def create_from_string(line: str) -> Line:
117
- """Creates a line from a string.
120
+ # A tab character was removed.
121
+ REMOVED_TAB = 10
118
122
 
119
- The function splits the input into content and end_of_line_marker.
120
- """
121
- for end_of_line_marker in ["\r\n", "\n", "\r"]:
122
- if line.endswith(end_of_line_marker):
123
- return Line(line[: -len(end_of_line_marker)], end_of_line_marker)
124
- return Line(line, "")
123
+ # A non-standard whitespace character (`\f` or `\v`) was replaced by a space character.
124
+ REPLACED_NONSTANDARD_WHITESPACE = 11
125
125
 
126
+ # A non-standard whitespace character (`\f` or `\v`) was removed.
127
+ REMOVED_NONSTANDARD_WHITESPACE = 12
126
128
 
127
- def split_lines(text: str) -> List[Line]:
128
- """Splits a string into lines."""
129
- lines: List[Line] = []
130
- current_line = ""
131
- for i, char in enumerate(text):
132
- current_line += char
133
- if (char == "\n") or (
134
- (char == "\r") and ((i >= len(text) - 1) or (not text[i + 1] == "\n"))
135
- ):
136
- lines.append(Line.create_from_string(current_line))
137
- current_line = ""
138
129
 
139
- if current_line:
140
- lines.append(Line.create_from_string(current_line))
130
+ @dataclasses.dataclass
131
+ class Change:
132
+ """Description of a change of the content of a file."""
141
133
 
142
- return lines
134
+ change_type: ChangeType
135
+ line_number: int
136
+ changed_from: str = ""
137
+ changed_to: str = ""
143
138
 
139
+ def message(self, check_only: bool) -> str:
140
+ """Returns a message describing the change."""
141
+ check_only_word = " would be " if check_only else " "
144
142
 
145
- def concatenate_lines(lines: List[Line]) -> str:
146
- """Concatenates a list of lines into a single string including end-of-line markers."""
147
- return "".join(line.content + line.end_of_line_marker for line in lines)
143
+ if self.change_type == ChangeType.ADDED_NEW_LINE_MARKER_TO_END_OF_FILE:
144
+ return f"New line marker{check_only_word}added to the end of the file."
148
145
 
146
+ if self.change_type == ChangeType.REMOVED_NEW_LINE_MARKER_FROM_END_OF_FILE:
147
+ return f"New line marker{check_only_word}removed from the end of the file."
149
148
 
150
- def guess_end_of_line_marker(lines: List[Line]) -> str:
151
- """Guesses the end of line marker.
149
+ if self.change_type == ChangeType.REPLACED_NEW_LINE_MARKER:
150
+ return (
151
+ f"New line marker '{escape_chars(self.changed_from)}'"
152
+ f"{check_only_word}replaced by '{escape_chars(self.changed_to)}'."
153
+ )
152
154
 
153
- The function returns the most common end-of-line marker.
154
- Ties are broken in order Linux "\n", Mac "\r", Windows "\r\n".
155
- If no end-of-line marker is present, default to the Linux "\n" end-of-line marker.
156
- """
157
- counts: Dict[str, int] = {"\n": 0, "\r": 0, "\r\n": 0}
158
- for line in lines:
159
- if line.end_of_line_marker in counts:
160
- counts[line.end_of_line_marker] += 1
161
- max_count = max(counts.values())
162
- for end_of_line_marker, count in counts.items():
163
- if count == max_count:
164
- return end_of_line_marker
165
- return "\n" # This return statement is never executed.
166
-
167
-
168
- def remove_trailing_empty_lines(lines: List[Line]) -> List[Line]:
169
- """Removes trailing empty lines.
170
-
171
- If there are no lines, empty list is returned.
172
- If all lines are empty, the first line is kept.
173
- """
174
- num_empty_trailing_lines = 0
175
- while (num_empty_trailing_lines < len(lines) - 1) and (
176
- not lines[-num_empty_trailing_lines - 1].content
177
- ):
178
- num_empty_trailing_lines += 1
179
- return copy.deepcopy(lines[: len(lines) - num_empty_trailing_lines])
155
+ if self.change_type == ChangeType.REMOVED_TRAILING_WHITESPACE:
156
+ return f"Trailing whitespace{check_only_word}removed."
180
157
 
158
+ if self.change_type == ChangeType.REMOVED_EMPTY_LINES:
159
+ return f"Empty line(s) at the end of the file{check_only_word}removed."
181
160
 
182
- def remove_dummy_lines(lines: List[Line]) -> List[Line]:
183
- """Remove empty lines that also have empty end-of-line markers."""
184
- return [line for line in lines if line.content or line.end_of_line_marker]
161
+ if self.change_type == ChangeType.REPLACED_EMPTY_FILE_WITH_ONE_LINE:
162
+ return f"Empty file{check_only_word}replaced with a single empty line."
185
163
 
164
+ if self.change_type == ChangeType.REPLACED_WHITESPACE_ONLY_FILE_WITH_EMPTY_FILE:
165
+ return f"File{check_only_word}replaced with an empty file."
186
166
 
187
- def remove_trailing_whitespace(lines: List[Line]) -> List[Line]:
188
- """Removes trailing whitespace from every line."""
189
- lines = [
190
- Line(
191
- re.sub(r"[ \n\r\t\f\v]*$", "", line.content),
192
- line.end_of_line_marker,
193
- )
194
- for line in lines
195
- ]
196
- return remove_dummy_lines(lines)
167
+ if self.change_type == ChangeType.REPLACED_WHITESPACE_ONLY_FILE_WITH_ONE_LINE:
168
+ return f"File{check_only_word}replaced with a single empty line."
197
169
 
170
+ if self.change_type == ChangeType.REPLACED_TAB_WITH_SPACES:
171
+ return f"Tab{check_only_word}replaced with spaces."
198
172
 
199
- def normalize_end_of_line_markers(lines: List[Line], new_end_of_line_marker: str) -> List[Line]:
200
- """Replaces end-of-line marker in all lines with a new end-of-line marker.
173
+ if self.change_type == ChangeType.REMOVED_TAB:
174
+ return f"Tab{check_only_word}removed."
201
175
 
202
- Lines without end-of-line markers (i.e. possibly the last line) are left unchanged.
203
- """
204
- return [
205
- Line(line.content, new_end_of_line_marker) if line.end_of_line_marker else line
206
- for line in lines
207
- ]
176
+ if self.change_type == ChangeType.REPLACED_NONSTANDARD_WHITESPACE:
177
+ return (
178
+ f"Non-standard whitespace character '{escape_chars(self.changed_from)}'"
179
+ f"{check_only_word}replaced by a space."
180
+ )
208
181
 
182
+ if self.change_type == ChangeType.REMOVED_NONSTANDARD_WHITESPACE:
183
+ return f"Non-standard whitespace character '{escape_chars(self.changed_from)}'{check_only_word}removed."
209
184
 
210
- def remove_all_end_of_line_markers_from_end_of_file(lines: List[Line]) -> List[Line]:
211
- """Removes all end-of-line markers from the end of the file."""
212
- lines = remove_trailing_empty_lines(lines)
213
- if not lines:
214
- return []
215
- lines[-1] = Line(lines[-1].content, "")
216
- return remove_dummy_lines(lines)
185
+ raise ValueError(f"Unknown change type: {self.change_type}")
217
186
 
187
+ def color_print(self, parsed_arguments: argparse.Namespace) -> None:
188
+ """Prints a message in color."""
189
+ color_print(
190
+ f"[BOLD][BLUE]↳ line {self.line_number + 1}: "
191
+ f"[WHITE]{self.message(parsed_arguments.check_only)}[RESET_ALL]",
192
+ parsed_arguments,
193
+ )
218
194
 
219
- def add_end_of_line_marker_at_end_of_file(
220
- lines: List[Line], new_end_of_line_marker: str
221
- ) -> List[Line]:
222
- """Adds new end-of-line marker to the end of file if it is missing."""
223
- if not lines:
224
- return [Line("", new_end_of_line_marker)]
225
- lines = copy.deepcopy(lines)
226
- lines[-1] = Line(lines[-1].content, new_end_of_line_marker)
227
- return lines
228
195
 
196
+ def color_print(message: str, parsed_arguments: argparse.Namespace):
197
+ """Outputs a colored message."""
198
+ if parsed_arguments.quiet:
199
+ return
200
+ for color, code in COLORS.items():
201
+ if parsed_arguments.color:
202
+ message = message.replace(f"[{color}]", code)
203
+ else:
204
+ message = message.replace(f"[{color}]", "")
205
+ print(message)
229
206
 
230
- def normalize_empty_file(lines: List[Line], mode: str, new_end_of_line_marker: str) -> List[Line]:
231
- """Replaces file with an empty file."""
232
- if mode == "empty":
233
- return []
234
- if mode == "one-line":
235
- return [Line("", new_end_of_line_marker)]
236
- return copy.deepcopy(lines)
237
207
 
208
+ def escape_chars(text: str) -> str:
209
+ """Escapes special characters in a string."""
210
+ return text.translate(ESCAPE_TRANSLATION_TABLE)
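To make the effect of `escape_chars` concrete, the standalone sketch below rebuilds the translation table from the added lines of this diff and applies it to a sample string. The table and function are copied here only so the snippet runs on its own; the sample input is made up for illustration.

```python
# Maps each invisible whitespace character to its visible two-character escape.
ESCAPE_TRANSLATION_TABLE = str.maketrans(
    {
        "\r": "\\r",
        "\n": "\\n",
        "\t": "\\t",
        "\v": "\\v",
        "\f": "\\f",
    }
)


def escape_chars(text: str) -> str:
    """Escapes special characters in a string."""
    return text.translate(ESCAPE_TRANSLATION_TABLE)


# The output shows a backslash followed by a letter instead of the raw character.
print(escape_chars("tab\there\r\n"))  # prints tab\there\r\n
```

Spaces are left untouched by the table, so only characters that would otherwise be invisible or break the line are rewritten in messages.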
238
211
 
239
- def is_whitespace_only(lines: List[Line]) -> bool:
240
- """Determines if file consists only of whitespace."""
241
- for line in lines:
242
- if line.content.strip(" \n\r\t\v\f"):
243
- return False
244
- return True
245
212
 
213
+ def string_to_hex(text: str) -> str:
214
+ """Converts a string into a human-readable hexadecimal representation.
246
215
 
247
- def normalize_non_standard_whitespace(lines: List[Line], mode: str) -> List[Line]:
248
- """Removes non-standard whitespace characters."""
249
- if mode == "ignore":
250
- return copy.deepcopy(lines)
251
- if mode == "replace":
252
- return [
253
- Line(line.content.translate(str.maketrans("\v\f", " ", "")), line.end_of_line_marker)
254
- for line in lines
255
- ]
256
- return [
257
- Line(line.content.translate(str.maketrans("", "", "\v\f")), line.end_of_line_marker)
258
- for line in lines
259
- ]
216
+ This function is for debugging purposes only. It is used only during development.
217
+ """
218
+ return ":".join(f"{ord(character):02x}" for character in text)
260
219
 
261
220
 
262
- def replace_tabs_with_spaces(lines: List[Line], num_spaces: int) -> List[Line]:
263
- """Replaces tabs with spaces."""
264
- if num_spaces < 0:
265
- return copy.deepcopy(lines)
266
- return [
267
- Line(line.content.replace("\t", num_spaces * " "), line.end_of_line_marker)
268
- for line in lines
269
- ]
221
+ def die(error_code: int, message: str = ""):
222
+ """Exits the script."""
223
+ if message:
224
+ print(message)
225
+ sys.exit(error_code)
270
226
 
271
227
 
272
- def compute_difference(original_lines: List[Line], new_lines: List[Line]) -> List[int]:
273
- """Computes the indices of lines that differ."""
274
- line_numbers = [
275
- line_number
276
- for line_number, (original_line, new_line) in enumerate(zip(original_lines, new_lines))
277
- if not original_line == new_line
278
- ]
279
- if len(original_lines) != len(new_lines):
280
- line_numbers.append(min(len(original_lines), len(new_lines)))
281
- return line_numbers
228
+ def read_file_content(file_name: str, encoding: str) -> str:
229
+ """Reads content of a file.
282
230
 
231
+ New line markers are preserved in their original form.
232
+ """
233
+ try:
234
+ with open(file_name, "r", encoding=encoding, newline="") as file:
235
+ return file.read()
236
+ except IOError as exception:
237
+ die(2, f"Cannot read file '{file_name}': {exception}")
238
+ except UnicodeError as exception:
239
+ die(3, f"Cannot decode file '{file_name}': {exception}")
240
+ return ""
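The `newline=""` argument added to `open()` is what keeps the original markers intact. A small experiment, sketched below with a temporary file, shows the difference from the default text mode; the file name and sample bytes are arbitrary choices for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "mixed.txt")

    # Write raw bytes so that the markers on disk are exactly as given.
    with open(path, "wb") as file:
        file.write(b"one\r\ntwo\rthree\n")

    # Default text mode: "\r\n" and "\r" are translated to "\n" while reading.
    with open(path, "r", encoding="utf-8") as file:
        translated = file.read()

    # newline="": lines are still recognized, but markers are returned untranslated.
    with open(path, "r", encoding="utf-8", newline="") as file:
        preserved = file.read()

print(repr(translated))  # prints 'one\ntwo\nthree\n'
print(repr(preserved))  # prints 'one\r\ntwo\rthree\n'
```

Without this change a formatter would never see `\r\n` or `\r` in the text it reads, so it could not detect or report them.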
283
241
 
284
- @dataclasses.dataclass
285
- class ChangeDescription:
286
- """Description of a change of the content of a file."""
287
242
 
288
- check_only: str
289
- change: str
243
+ def write_file(file_name: str, file_content: str, encoding: str):
244
+ """Writes data to a file."""
245
+ try:
246
+ with open(file_name, "w", encoding=encoding) as file:
247
+ file.write(file_content)
248
+ except IOError as exception:
249
+ die(4, f"Cannot write to file '{file_name}': {exception}")
290
250
 
291
251
 
292
- @dataclasses.dataclass
293
- class LineChange:
294
- """Description of a change on a particular line."""
252
+ def is_whitespace_only(text: str) -> bool:
253
+ """Determines if a string consists of only whitespace characters."""
254
+ for char in text:
255
+ if char not in WHITESPACE_CHARACTERS:
256
+ return False
257
+ return True
295
258
 
296
- check_only: str
297
- change: str
298
- line_number: int
299
259
 
260
+ def find_most_common_new_line_marker(text: str) -> str:
261
+ """Returns the most common new line marker in a string.
300
262
 
301
- class FileContentTracker:
302
- """Tracks changes of the content of a file as it undergoes formatting."""
263
+ If there are ties, prefer Linux '\n' to Windows '\r\n' to Mac '\r'.
264
+ If there are no new line markers, return Linux.
303
265
 
304
- def __init__(self, lines: List[Line]):
305
- """Initializes an instance of the file content tracker."""
306
- self.initial_lines = lines
307
- self.lines = copy.deepcopy(lines)
308
- self.line_changes: List[LineChange] = []
266
+ Args:
267
+ text: A string.
268
+
269
+ Returns:
270
+ Either '\n', or '\r\n' or '\r'.
271
+ """
272
+ linux_count = 0
273
+ mac_count = 0
274
+ windows_count = 0
275
+ i = 0
309
276
 
310
- def format(self, change: ChangeDescription, function: Callable[..., List[Line]], *args):
311
- """Applies a change to the content of the file."""
312
- previous_content = self.lines
313
- self.lines = function(self.lines, *args)
314
- if previous_content != self.lines:
315
- line_numbers = compute_difference(previous_content, self.lines)
316
- for line_number in line_numbers:
317
- self.line_changes.append(LineChange(change.check_only, change.change, line_number))
277
+ while i < len(text):
278
+ if text[i] == CARRIAGE_RETURN:
279
+ if i < len(text) - 1 and text[i + 1] == LINE_FEED:
280
+ windows_count += 1
281
+ i += 1
282
+ else:
283
+ mac_count += 1
284
+ elif text[i] == LINE_FEED:
285
+ linux_count += 1
286
+ i += 1
318
287
 
319
- def is_changed(self) -> bool:
320
- """Determines if the file content has changed."""
321
- return self.lines != self.initial_lines
288
+ if mac_count > windows_count and mac_count > linux_count:
289
+ return "\r"
290
+
291
+ if windows_count > linux_count:
292
+ return "\r\n"
293
+
294
+ return "\n"
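The tie-breaking rule in `find_most_common_new_line_marker` is easier to see on small inputs. The sketch below is a condensed restatement that counts markers with a regular expression instead of an index loop; it is an assumption-laden rewrite for illustration, not the function from the diff, and `most_common_new_line_marker` is a hypothetical name.

```python
import re


def most_common_new_line_marker(text: str) -> str:
    """Returns the most common of "\\n", "\\r\\n", and "\\r" in `text`."""
    # "\r\n" comes first in the pattern so it is counted as a single marker.
    markers = re.findall(r"\r\n|\r|\n", text)
    linux_count = markers.count("\n")
    mac_count = markers.count("\r")
    windows_count = markers.count("\r\n")

    # Mac wins only with a strict majority over both alternatives.
    if mac_count > windows_count and mac_count > linux_count:
        return "\r"
    # Windows wins only if it strictly beats Linux; all other ties go to Linux.
    if windows_count > linux_count:
        return "\r\n"
    return "\n"


print(repr(most_common_new_line_marker("")))  # prints '\n'
print(repr(most_common_new_line_marker("a\r\nb\n")))  # prints '\n'
print(repr(most_common_new_line_marker("a\r\nb\r")))  # prints '\r\n'
print(repr(most_common_new_line_marker("a\rb\rc\n")))  # prints '\r'
```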
322
295
 
323
296
 
324
297
  def format_file_content(
325
- file_content_tracker: FileContentTracker,
298
+ file_content: str,
326
299
  parsed_arguments: argparse.Namespace,
327
- ):
328
- """Formats the content of file represented as a string."""
329
- new_line_marker = END_OF_LINE_MARKERS.get(
300
+ ) -> Tuple[str, List[Change]]:
301
+ """Formats content of a file.
302
+
303
+ The formatting options are specified in the parsed_arguments.
304
+
305
+ Args:
306
+ file_content: Content of the file.
307
+ parsed_arguments: Parsed command line arguments.
308
+
309
+ Returns:
310
+ A pair consisting of the formatted file content and a list of changes.
311
+ """
312
+ output_new_line_marker = NEW_LINE_MARKERS.get(
330
313
  parsed_arguments.new_line_marker,
331
- guess_end_of_line_marker(file_content_tracker.initial_lines),
314
+ find_most_common_new_line_marker(file_content),
332
315
  )
333
316
 
334
- if is_whitespace_only(file_content_tracker.initial_lines):
335
- changes = {
336
- "ignore": ChangeDescription("", ""),
337
- "empty": ChangeDescription(
338
- check_only="File needs to be replaced with an empty file.",
339
- change="File was replaced with an empty file.",
340
- ),
341
- "one-line": ChangeDescription(
342
- check_only=(
343
- f"File must be replaced with a single-line empty line {repr(new_line_marker)}."
344
- ),
345
- change=(
346
- f"File was replaced with a single-line empty line {repr(new_line_marker)}."
347
- ),
348
- ),
349
- }
350
- if not file_content_tracker.initial_lines:
351
- file_content_tracker.format(
352
- changes[parsed_arguments.normalize_empty_files],
353
- normalize_empty_file,
354
- parsed_arguments.normalize_empty_files,
355
- new_line_marker,
356
- )
357
- else:
358
- file_content_tracker.format(
359
- changes[parsed_arguments.normalize_whitespace_only_files],
360
- normalize_empty_file,
361
- parsed_arguments.normalize_whitespace_only_files,
362
- new_line_marker,
363
- )
364
-
365
- else:
366
- if parsed_arguments.remove_trailing_whitespace:
367
- file_content_tracker.format(
368
- ChangeDescription(
369
- check_only="Whitespace at the end of line needs to be removed.",
370
- change="Whitespace at the end of line was removed.",
371
- ),
372
- remove_trailing_whitespace,
373
- )
317
+ # Handle empty file:
318
+ if not file_content:
319
+ if parsed_arguments.normalize_empty_files in ["ignore", "empty"]:
320
+ return "", []
321
+ if parsed_arguments.normalize_empty_files == "one-line":
322
+ return output_new_line_marker, [Change(ChangeType.REPLACED_EMPTY_FILE_WITH_ONE_LINE, 1)]
323
+
324
+ # Handle non-empty file consisting of whitespace only.
325
+ if is_whitespace_only(file_content):
326
+ if parsed_arguments.normalize_whitespace_only_files == "empty":
327
+ return "", [Change(ChangeType.REPLACED_WHITESPACE_ONLY_FILE_WITH_EMPTY_FILE, 1)]
328
+ if parsed_arguments.normalize_whitespace_only_files == "one-line":
329
+ if file_content == output_new_line_marker:
330
+ return file_content, []
331
+ return output_new_line_marker, [
332
+ Change(ChangeType.REPLACED_WHITESPACE_ONLY_FILE_WITH_ONE_LINE, 1)
333
+ ]
334
+ if parsed_arguments.normalize_whitespace_only_files == "ignore":
335
+ return file_content, []
336
+
337
+ # Index into the input buffer.
338
+ i = 0
339
+
340
+ # List of changes
341
+ changes: List[Change] = []
342
+
343
+ # Line number. It is incremented every time we encounter a new end of line marker.
344
+ line_number = 1
345
+
346
+ # Position one character past the end of last line in the output buffer
347
+ # including the last end of line marker.
348
+ last_end_of_line_including_eol_marker = 0
349
+
350
+ # Position one character past the last non-whitespace character in the output buffer.
351
+ last_non_whitespace = 0
352
+
353
+ # Position one character past the end of last non-empty line in the output buffer
354
+ # excluding the last end of line marker.
355
+ last_end_of_non_empty_line_excluding_eol_marker = 0
356
+
357
+ # Position one character past the end of last non-empty line in the output buffer,
358
+ # including the last end of line marker.
359
+ last_end_of_non_empty_line_including_eol_marker = 0
360
+
361
+ # Line number of the last non-empty line.
362
+ last_non_empty_line_number = 0
363
+
364
+ # Formatted output
365
+ output = ""
366
+
367
+ while i < len(file_content):
368
+ if file_content[i] in [CARRIAGE_RETURN, LINE_FEED]:
369
+ # Parse the new line marker
370
+ new_line_marker = ""
371
+ if file_content[i] == LINE_FEED:
372
+ new_line_marker = LINE_FEED
373
+ elif i < len(file_content) - 1 and file_content[i + 1] == LINE_FEED:
374
+ new_line_marker = "\r\n"
375
+ # Windows new line marker consists of two characters.
376
+ # Skip the extra character.
377
+ i += 1
378
+ else:
379
+ new_line_marker = CARRIAGE_RETURN
380
+
381
+ # Remove trailing whitespace
382
+ if parsed_arguments.remove_trailing_whitespace and max(
383
+ last_non_whitespace, last_end_of_line_including_eol_marker
384
+ ) < len(output):
385
+ changes.append(
386
+ Change(
387
+ ChangeType.REMOVED_TRAILING_WHITESPACE,
388
+ line_number,
389
+ )
390
+ )
391
+ output = output[
392
+ : max(
393
+ last_non_whitespace,
394
+ last_end_of_line_including_eol_marker,
395
+ )
396
+ ]
397
+
398
+ # Determine if the last line is empty
399
+ is_empty_line: bool = last_end_of_line_including_eol_marker == len(output)
400
+
401
+ # Position one character past the end of last line in the output buffer
402
+ # excluding the last end of line marker.
403
+ last_end_of_line_excluding_eol_marker = len(output)
404
+
405
+ # Add new line marker
406
+ if (
407
+ parsed_arguments.normalize_new_line_markers
408
+ and output_new_line_marker != new_line_marker
409
+ ):
410
+ changes.append(
411
+ Change(
412
+ ChangeType.REPLACED_NEW_LINE_MARKER,
413
+ line_number,
414
+ new_line_marker,
415
+ output_new_line_marker,
416
+ )
417
+ )
418
+ output += output_new_line_marker
419
+ else:
420
+ output += new_line_marker
374
421
 
375
- if parsed_arguments.remove_trailing_empty_lines:
376
- file_content_tracker.format(
377
- ChangeDescription(
378
- check_only="Empty line(s) at the end of file need to be removed.",
379
- change="Empty line(s) at the end of file were removed.",
380
- ),
381
- remove_trailing_empty_lines,
382
- )
422
+ last_end_of_line_including_eol_marker = len(output)
383
423
 
384
- file_content_tracker.format(
385
- ChangeDescription(
386
- check_only="Tabs need to be replaced with spaces.",
387
- change="Tabs were replaced by spaces.",
388
- ),
389
- replace_tabs_with_spaces,
390
- parsed_arguments.replace_tabs_with_spaces,
391
- )
424
+ # Update position of last non-empty line.
425
+ if not is_empty_line:
426
+ last_end_of_non_empty_line_excluding_eol_marker = (
427
+ last_end_of_line_excluding_eol_marker
428
+ )
429
+ last_end_of_non_empty_line_including_eol_marker = (
430
+ last_end_of_line_including_eol_marker
431
+ )
432
+ last_non_empty_line_number = line_number
433
+
434
+ line_number += 1
435
+
436
+ elif file_content[i] == SPACE:
437
+ output += file_content[i]
438
+
439
+ elif file_content[i] == TAB:
440
+ if parsed_arguments.replace_tabs_with_spaces < 0:
441
+ output += file_content[i]
442
+ elif parsed_arguments.replace_tabs_with_spaces > 0:
443
+ changes.append(Change(ChangeType.REPLACED_TAB_WITH_SPACES, line_number))
444
+ output += SPACE * parsed_arguments.replace_tabs_with_spaces
445
+ else:
446
+ # Remove the tab character.
447
+ changes.append(Change(ChangeType.REMOVED_TAB, line_number))
448
+
449
+ elif file_content[i] in [VERTICAL_TAB, FORM_FEED]:
450
+ if parsed_arguments.normalize_non_standard_whitespace == "ignore":
451
+ output += file_content[i]
452
+ elif parsed_arguments.normalize_non_standard_whitespace == "replace":
453
+ output += SPACE
454
+ changes.append(
455
+ Change(
456
+ ChangeType.REPLACED_NONSTANDARD_WHITESPACE,
457
+ line_number,
458
+ file_content[i],
459
+ SPACE,
460
+ )
461
+ )
462
+ elif parsed_arguments.normalize_non_standard_whitespace == "remove":
463
+ changes.append(
464
+ Change(
465
+ ChangeType.REMOVED_NONSTANDARD_WHITESPACE, line_number, file_content[i], ""
466
+ )
467
+ )
468
+ else:
469
+ raise ValueError("Unknown value of normalize_non_standard_whitespace")
470
+ else:
471
+ output += file_content[i]
472
+ last_non_whitespace = len(output)
392
473
 
393
- file_content_tracker.format(
394
- ChangeDescription(
395
- check_only=(
396
- "Non-standard whitespace characters need to be removed or replaced by spaces."
397
- ),
398
- change="Non-standard whitespace characters were removed or replaced by spaces.",
399
- ),
400
- normalize_non_standard_whitespace,
401
- parsed_arguments.normalize_non_standard_whitespace,
402
- )
474
+ # Move to the next character
475
+ i += 1
403
476
 
404
- if parsed_arguments.normalize_new_line_markers:
405
- file_content_tracker.format(
406
- ChangeDescription(
407
- check_only=(
408
- f"New line marker(s) need to be replaced with {repr(new_line_marker)}."
409
- ),
410
- change=f"New line marker(s) were replaced with {repr(new_line_marker)}.",
411
- ),
412
- normalize_end_of_line_markers,
413
- new_line_marker,
414
- )
477
+ # Remove trailing whitespace from the last line.
478
+ if (
479
+ parsed_arguments.remove_trailing_whitespace
480
+ and last_end_of_line_including_eol_marker < len(output)
481
+ and last_non_whitespace < len(output)
482
+ ):
483
+ changes.append(Change(ChangeType.REMOVED_TRAILING_WHITESPACE, line_number))
484
+ output = output[:last_non_whitespace]
485
+
486
+ # Remove trailing empty lines.
487
+ if (
488
+ parsed_arguments.remove_trailing_empty_lines
489
+ and last_end_of_line_including_eol_marker == len(output)
490
+ and last_end_of_non_empty_line_including_eol_marker < len(output)
491
+ ):
492
+ line_number = last_non_empty_line_number + 1
493
+ last_end_of_line_including_eol_marker = last_end_of_non_empty_line_including_eol_marker
494
+ changes.append(Change(ChangeType.REMOVED_EMPTY_LINES, line_number))
495
+ output = output[:last_end_of_non_empty_line_including_eol_marker]
496
+
497
+ # Add new line marker at the end of the file
498
+ if (
499
+ parsed_arguments.add_new_line_marker_at_end_of_file
500
+ and last_end_of_line_including_eol_marker < len(output)
501
+ ):
502
+ changes.append(Change(ChangeType.ADDED_NEW_LINE_MARKER_TO_END_OF_FILE, line_number))
503
+ output += output_new_line_marker
504
+ last_end_of_line_including_eol_marker = len(output)
505
+ line_number += 1
506
+
507
+ # Remove new line marker(s) from the end of the file
508
+ if (
509
+ parsed_arguments.remove_new_line_marker_from_end_of_file
510
+ and last_end_of_line_including_eol_marker == len(output)
511
+ and line_number >= 2
512
+ ):
513
+ line_number = last_non_empty_line_number
514
+ changes.append(Change(ChangeType.REMOVED_NEW_LINE_MARKER_FROM_END_OF_FILE, line_number))
515
+ output = output[:last_end_of_non_empty_line_excluding_eol_marker]
415
516
 
416
- if parsed_arguments.add_new_line_marker_at_end_of_file:
417
- file_content_tracker.format(
418
- ChangeDescription(
419
- check_only=f"New line marker needs to be added to the end of the file, "
420
- f"or replaced with {repr(new_line_marker)}.",
421
- change=f"New line marker was added to the end of the file, "
422
- f"or replaced with {repr(new_line_marker)}.",
423
- ),
424
- add_end_of_line_marker_at_end_of_file,
425
- new_line_marker,
426
- )
427
- elif parsed_arguments.remove_new_line_marker_from_end_of_file:
428
- file_content_tracker.format(
429
- ChangeDescription(
430
- check_only="New line marker(s) need to be removed from the end of the file.",
431
- change="New line marker(s) were removed from the end of the file.",
432
- ),
433
- remove_all_end_of_line_markers_from_end_of_file,
434
- )
517
+ return output, changes
435
518
 
436
519
 
437
520
  def reformat_file(file_name: str, parsed_arguments: argparse.Namespace) -> bool:
438
- """Reformats a file."""
521
+ """Reformats a file.
522
+
523
+ Args:
524
+ file_name: Name of the file to reformat.
525
+ parsed_arguments: Parsed command line arguments.
526
+
527
+ Returns:
528
+ True if the file was changed, False otherwise.
529
+ """
439
530
  file_content = read_file_content(file_name, parsed_arguments.encoding)
440
- lines = split_lines(file_content)
441
- file_content_tracker = FileContentTracker(lines)
442
- format_file_content(file_content_tracker, parsed_arguments)
443
- is_changed = file_content_tracker.is_changed()
531
+ formatted_file_content, file_changes = format_file_content(file_content, parsed_arguments)
444
532
  if parsed_arguments.verbose:
445
- color_print(f"Processing file '{file_name}'...", parsed_arguments)
533
+ color_print(f"[WHITE]Processing file [BOLD]{file_name}[RESET_ALL]...", parsed_arguments)
446
534
  if parsed_arguments.check_only:
447
- if is_changed:
535
+ if file_changes:
448
536
  color_print(
449
537
  f"[RED]✘[RESET_ALL] [BOLD][WHITE]{file_name} "
450
538
  f"[RED]needs to be formatted[RESET_ALL]",
451
539
  parsed_arguments,
452
540
  )
453
- for line_change in file_content_tracker.line_changes:
454
- color_print(
455
- f" [BOLD][BLUE]↳ line {line_change.line_number + 1}: "
456
- f"[WHITE]{line_change.check_only}[RESET_ALL]",
457
- parsed_arguments,
458
- )
541
+ for line_change in file_changes:
542
+ print(" ", end="")
543
+ line_change.color_print(parsed_arguments)
459
544
  else:
460
545
  if parsed_arguments.verbose:
461
546
  color_print(
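The tab branch of `format_file_content` in the hunk above treats `replace_tabs_with_spaces` as a three-way switch: a negative value keeps tabs, a positive value replaces each tab with that many spaces, and zero removes tabs. A minimal sketch of that rule, assuming a hypothetical standalone helper `expand_tabs` that leaves out change tracking and every other option:

```python
def expand_tabs(text: str, replace_tabs_with_spaces: int) -> str:
    """Hypothetical helper mirroring only the tab branch of the formatter."""
    if replace_tabs_with_spaces < 0:
        # Negative value: leave tab characters untouched.
        return text
    # Positive value: each tab becomes that many spaces.
    # Zero: " " * 0 is the empty string, so tabs are removed.
    return text.replace("\t", " " * replace_tabs_with_spaces)
```

The real function works character by character so that it can also record a `Change` with the line number; the sketch only reproduces the resulting text.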
@@ -464,21 +549,21 @@ def reformat_file(file_name: str, parsed_arguments: argparse.Namespace) -> bool:
464
549
  parsed_arguments,
465
550
  )
466
551
  else:
467
- if is_changed:
552
+ if file_changes:
468
553
  color_print(f"[WHITE]Reformatted [BOLD]{file_name}[RESET_ALL]", parsed_arguments)
469
- for line_change in file_content_tracker.line_changes:
470
- color_print(
471
- f" [BOLD][BLUE]↳ line {line_change.line_number + 1}: "
472
- f"[WHITE]{line_change.change}[RESET_ALL]",
473
- parsed_arguments,
474
- )
554
+ for line_change in file_changes:
555
+ print(" ", end="")
556
+ line_change.color_print(parsed_arguments)
475
557
  write_file(
476
- file_name, concatenate_lines(file_content_tracker.lines), parsed_arguments.encoding
558
+ file_name,
559
+ formatted_file_content,
560
+ parsed_arguments.encoding,
477
561
  )
478
562
  else:
479
563
  if parsed_arguments.verbose:
480
- color_print(f"[WHITE]{file_name} [BLUE]left unchanged[RESET_ALL]", parsed_arguments)
481
- return is_changed
564
+ color_print(f"[WHITE]Unchanged [BOLD]{file_name}[RESET_ALL]", parsed_arguments)
565
+
566
+ return bool(file_changes)
482
567
 
483
568
 
484
569
  def reformat_files(file_names: List[str], parsed_arguments: argparse.Namespace):
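The new `format_file_content` removes trailing empty lines by remembering output positions during a single pass and truncating once at the end, instead of running a separate pass per option. A simplified sketch of that idea, assuming only `"\n"` line markers and a hypothetical helper name `strip_trailing_empty_lines`; the variable names are shortened versions of the ones in the diff, and none of the other options are modeled:

```python
def strip_trailing_empty_lines(text: str) -> str:
    """Hypothetical single-pass sketch; handles only "\\n" line markers."""
    output = ""
    line_start = 0          # Position where the current line begins.
    last_line_end = 0       # One past the most recent end-of-line marker.
    last_non_empty_end = 0  # One past the marker of the last non-empty line.

    for character in text:
        output += character
        if character == "\n":
            # The line is empty if the marker is its only character.
            is_empty_line = len(output) - 1 == line_start
            last_line_end = len(output)
            if not is_empty_line:
                last_non_empty_end = last_line_end
            line_start = last_line_end

    # Truncate only if the output ends with a line marker and at least
    # one empty line follows the last non-empty line.
    if last_line_end == len(output) and last_non_empty_end < len(output):
        output = output[:last_non_empty_end]
    return output
```

Text that does not end with a line marker is left alone by this sketch, which mirrors the `last_end_of_line_including_eol_marker == len(output)` guard in the diff.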
@@ -571,20 +656,24 @@ def parse_command_line() -> argparse.Namespace:
571
656
  default="utf-8",
572
657
  type=str,
573
658
  )
574
- parser.add_argument(
659
+
660
+ # Mutually exclusive group of parameters.
661
+ group1 = parser.add_mutually_exclusive_group()
662
+ group1.add_argument(
575
663
  "--verbose",
576
664
  help="Print more messages than normally.",
577
665
  required=False,
578
666
  action="store_true",
579
667
  default=False,
580
668
  )
581
- parser.add_argument(
669
+ group1.add_argument(
582
670
  "--quiet",
583
671
  help="Do not print any messages, except for errors when reading or writing files.",
584
672
  required=False,
585
673
  action="store_true",
586
674
  default=False,
587
675
  )
676
+
588
677
  parser.add_argument(
589
678
  "--color",
590
679
  help="Print messages in color.",
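The `--verbose` and `--quiet` flags above are now registered through `add_mutually_exclusive_group()`, so argparse itself rejects the combination. A minimal, self-contained sketch of that behavior; the `build_parser` and `parse` helpers are hypothetical and contain only these two flags:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: a parser with only the two conflicting flags."""
    parser = argparse.ArgumentParser(prog="example")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--verbose", action="store_true", default=False)
    group.add_argument("--quiet", action="store_true", default=False)
    return parser


def parse(argv):
    """Returns the parsed namespace, or None if argparse rejects the arguments."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints a usage error to stderr and exits with status 2.
        return None
```

Because the conflict is caught at parse time, a manual override such as the `if parsed_arguments.verbose: parsed_arguments.quiet = False` block removed later in this diff is no longer needed.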
@@ -622,6 +711,10 @@ def parse_command_line() -> argparse.Namespace:
622
711
  "mac: Use Mac new line marker '\\r'. "
623
712
  "windows: Use Windows new line marker '\\r\\n'. "
624
713
  ),
714
+ required=False,
715
+ type=str,
716
+ choices=["auto", "linux", "mac", "windows"],
717
+ default="auto",
625
718
  )
626
719
  parser.add_argument(
627
720
  "--normalize-new-line-markers",
@@ -661,20 +754,27 @@ def parse_command_line() -> argparse.Namespace:
661
754
  default="ignore",
662
755
  choices=["ignore", "empty", "one-line"],
663
756
  )
664
- parser.add_argument(
757
+
758
+ # Mutually exclusive group of parameters.
759
+ group2 = parser.add_mutually_exclusive_group()
760
+ group2.add_argument(
665
761
  "--add-new-line-marker-at-end-of-file",
666
762
  help="Add missing new line marker at end of each file.",
667
763
  required=False,
668
764
  default=False,
669
765
  action="store_true",
670
766
  )
671
- parser.add_argument(
767
+ group2.add_argument(
672
768
  "--remove-new-line-marker-from-end-of-file",
673
- help="Remove new line markers from the end of each file.",
769
+ help="Remove new line markers from the end of each file. "
770
+ "This option conflicts with --add-new-line-marker-at-end-of-file. "
771
+ "This option implies --remove-trailing-empty-lines option, i.e., "
772
+ "all empty lines at the end of the file are removed.",
674
773
  required=False,
675
774
  default=False,
676
775
  action="store_true",
677
776
  )
777
+
678
778
  parser.add_argument(
679
779
  "--remove-trailing-whitespace",
680
780
  help="Remove whitespace at the end of each line.",
@@ -684,7 +784,8 @@ def parse_command_line() -> argparse.Namespace:
684
784
  )
685
785
  parser.add_argument(
686
786
  "--remove-trailing-empty-lines",
687
- help="Remove empty lines at the end of each file.",
787
+ help="Remove empty lines at the end of each file. "
788
+ "If --remove-new-line-marker-from-end-of-file is used, this option is used automatically.",
688
789
  required=False,
689
790
  default=False,
690
791
  action="store_true",
@@ -721,8 +822,8 @@ def parse_command_line() -> argparse.Namespace:
721
822
  if parsed_arguments.normalize_whitespace_only_files == "empty":
722
823
  parsed_arguments.normalize_empty_files = parsed_arguments.normalize_whitespace_only_files
723
824
 
724
- if parsed_arguments.verbose:
725
- parsed_arguments.quiet = False
825
+ if parsed_arguments.remove_new_line_marker_from_end_of_file:
826
+ parsed_arguments.remove_trailing_empty_lines = True
726
827
 
727
828
  return parsed_arguments
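The help text above states that `--remove-new-line-marker-from-end-of-file` implies `--remove-trailing-empty-lines`. A minimal sketch of such a post-parse implication, assuming a hypothetical parser with only these two flags. It sets `remove_trailing_empty_lines`, which is the attribute name argparse derives from the flag and the one `format_file_content` reads:

```python
import argparse


def parse_with_implication(argv):
    """Hypothetical parser with only the two flags involved in the implication."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "--remove-new-line-marker-from-end-of-file",
        action="store_true",
        default=False,
    )
    parser.add_argument(
        "--remove-trailing-empty-lines",
        action="store_true",
        default=False,
    )
    args = parser.parse_args(argv)

    # argparse turns "--remove-trailing-empty-lines" into the attribute
    # remove_trailing_empty_lines (hyphens become underscores).
    if args.remove_new_line_marker_from_end_of_file:
        args.remove_trailing_empty_lines = True
    return args
```

Assigning to any other attribute name would succeed silently on an `argparse.Namespace`, so the implication would have no effect on code that reads `remove_trailing_empty_lines`.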
728
829
 
@@ -1,6 +0,0 @@
1
- whitespace_format.py,sha256=cSM_YnzCNsJjPGz-b9HVINv3TT8YK8C51m5RePwPNCk,25434
2
- whitespace_format-0.0.4.dist-info/LICENSE,sha256=rT6UNfWDYFQc-eo65FioDJRMAyVOndtF95wNCUhkK74,1076
3
- whitespace_format-0.0.4.dist-info/METADATA,sha256=Dk2GpHPm2ll_EvKpMxFCP86MCyU49rkYr43wBkZ7D7Q,10172
4
- whitespace_format-0.0.4.dist-info/WHEEL,sha256=d2fvjOD7sXsVzChCqf0Ty0JbHKBaLYwDbGQDwQTnJ50,88
5
- whitespace_format-0.0.4.dist-info/entry_points.txt,sha256=LbXoevzUZAF5MVbI2foNC9xeDjKS_Woz7VbA1ZNF5CY,60
6
- whitespace_format-0.0.4.dist-info/RECORD,,