exhash 0.3.2__tar.gz → 0.3.3__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {exhash-0.3.2 → exhash-0.3.3}/Cargo.lock +3 -3
- {exhash-0.3.2 → exhash-0.3.3}/Cargo.toml +1 -1
- {exhash-0.3.2 → exhash-0.3.3}/DEV.md +8 -2
- {exhash-0.3.2 → exhash-0.3.3}/PKG-INFO +51 -8
- {exhash-0.3.2 → exhash-0.3.3}/README.md +50 -7
- {exhash-0.3.2 → exhash-0.3.3}/pyproject.toml +3 -1
- exhash-0.3.3/python/exhash/__init__.py +337 -0
- exhash-0.3.3/python/exhash/skill.py +44 -0
- {exhash-0.3.2 → exhash-0.3.3}/src/bin/exhash.rs +50 -30
- {exhash-0.3.2 → exhash-0.3.3}/src/engine.rs +67 -44
- {exhash-0.3.2 → exhash-0.3.3}/src/lib.rs +6 -2
- {exhash-0.3.2 → exhash-0.3.3}/src/lnhash.rs +42 -10
- {exhash-0.3.2 → exhash-0.3.3}/src/parse.rs +77 -39
- {exhash-0.3.2 → exhash-0.3.3}/src/python.rs +31 -12
- {exhash-0.3.2 → exhash-0.3.3}/tests/cli.rs +67 -9
- {exhash-0.3.2 → exhash-0.3.3}/tests/test_exhash.py +111 -0
- exhash-0.3.2/python/exhash/__init__.py +0 -99
- {exhash-0.3.2 → exhash-0.3.3}/.github/workflows/ci.yml +0 -0
- {exhash-0.3.2 → exhash-0.3.3}/.gitignore +0 -0
- {exhash-0.3.2 → exhash-0.3.3}/_config.yml +0 -0
- {exhash-0.3.2 → exhash-0.3.3}/_layouts/default.html +0 -0
- {exhash-0.3.2 → exhash-0.3.3}/python/exhash.data/scripts/.gitkeep +0 -0
- {exhash-0.3.2 → exhash-0.3.3}/src/bin/lnhashview.rs +0 -0
- {exhash-0.3.2 → exhash-0.3.3}/tools/build.sh +0 -0
- {exhash-0.3.2 → exhash-0.3.3}/tools/bump.sh +0 -0
- {exhash-0.3.2 → exhash-0.3.3}/tools/bump2.sh +0 -0
- {exhash-0.3.2 → exhash-0.3.3}/tools/release.sh +0 -0
- {exhash-0.3.2 → exhash-0.3.3}/tools/test.sh +0 -0
@@ -25,7 +25,7 @@ checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801"
 
 [[package]]
 name = "exhash"
-version = "0.3.
+version = "0.3.3"
 dependencies = [
  "pyo3",
  "regex",
@@ -48,9 +48,9 @@ dependencies = [
 
 [[package]]
 name = "libc"
-version = "0.2.
+version = "0.2.186"
 source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "
+checksum = "68ab91017fe16c622486840e4c83c9a37afeff978bd239b5293d61ece587de66"
 
 [[package]]
 name = "memchr"
@@ -18,7 +18,8 @@ src/
   bin/exhash.rs       CLI editor (atomic in-place edit, dry-run, stdin mode)
   bin/lnhashview.rs   CLI viewer
 python/exhash/
-  __init__.py         Python wrapper functions
+  __init__.py         Python wrapper functions plus file-aware exhash_file orchestration
+  skill.py            pyskills entry point exposing exhash APIs for LLM tools
 python/exhash.data/scripts/
   exhash              native binary (built, not checked in)
   lnhashview          native binary (built, not checked in)
@@ -50,6 +51,9 @@ cargo test && pytest -q
 `edit_text` verifies lnhashes command-by-command against the current in-memory buffer, immediately before each command executes (not all upfront). If an earlier command shifts or rewrites a later target line, that later command will fail with a stale-hash error unless you recompute addresses.
 The `$` (last line) and `%` (whole file) address forms are resolved against the current buffer and do not require hashes.
 `edit_text_with_sw` exposes configurable shift width for `<` and `>`; `edit_text` defaults to `sw=4`.
+In CLI and Python file-helper flows, a missing file is treated as empty input only when the parsed command set is valid against an empty buffer (for example `0|0000|a`); otherwise the original file-not-found error is preserved.
+Python `exhash_file` adds the file-qualified orchestration layer. It parses optional `path:` prefixes, applies each command to the current in-memory buffer for that file, rejects cross-file source ranges, and writes changed files only after every command succeeds.
+`lnhashview` range requests clamp `end` past EOF to the last available line, while invalid `start` values still error.
 
 ## Release
 
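The `lnhashview` clamping rule added in the hunk above can be illustrated with a small pure-Python sketch. `view_range` is a hypothetical stand-in, not the Rust implementation: it numbers lines without computing hashes, and the exact conditions under which an invalid `start` is rejected are an assumption.

```python
def view_range(lines, start=None, end=None):
    """Return (lineno, line) pairs for the 1-based inclusive range start..end.

    Hypothetical stand-in for the lnhashview range logic: `end` past EOF is
    clamped to the last line, while an out-of-range `start` raises.
    """
    n = len(lines)
    start = 1 if start is None else start
    end = n if end is None else min(end, n)  # clamp end past EOF
    if start < 1 or start > max(n, 1):       # assumed validity rule for start
        raise ValueError(f"start out of range: {start}")
    return [(i, lines[i - 1]) for i in range(start, end + 1)]


lines = ["foo", "bar", "baz"]
print(view_range(lines, 2, 260))  # [(2, 'bar'), (3, 'baz')]
```

Clamping only `end` lets a caller ask for a generous window without knowing the file length, while a bad `start` still surfaces as an error.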
@@ -86,9 +90,11 @@ Maturin's `data` option in `pyproject.toml` points to `python/exhash.data/`. Fil
 
 The Rust core has three parsing functions:
 
-- `parse_commands_from_strs(&[&str])` — for the Python API; each string is one command, text blocks
+- `parse_commands_from_strs(&[&str])` — for the Python API; each string is one command, and multiline `a/i/c` text blocks must be in that same string using newlines, e.g. `["12|abcd|c\nnew line 1\nnew line 2"]`. Do not use `.` terminators or split the inserted text into separate command entries; a trailing `.` line is literal text and the Python binding warns about this common mistake.
 - `parse_commands_from_script(&str)` — for script strings; commands separated by newlines, text blocks terminated by `.`
 - `parse_commands_from_args(&[String], &mut BufRead)` — for the CLI; each arg is a command, text blocks read from stdin terminated by `.`
 
+File-qualified addresses are parsed by the Python `exhash_file` wrapper; the Rust parser and CLI remain single-buffer.
+
 Substitute parsing keeps Rust regex escapes intact (`\d`, `\w`, etc.) while still allowing escaped command delimiters (`\/`) in pattern and replacement.
 Transliteration uses `y/src/dst/` and validates equal character counts at parse time.
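The parse-time length check for `y/src/dst/` mentioned at the end of the hunk above can be sketched in Python. This is an illustration only: `parse_translit` is a hypothetical name, the real check lives in the Rust parser, and counting characters with `len` (Unicode code points) is an assumption about how the parser counts.

```python
def parse_translit(src, dst):
    """Build a translation table, rejecting unequal lengths up front."""
    if len(src) != len(dst):
        raise ValueError(
            f"y: source has {len(src)} chars, destination has {len(dst)}"
        )
    return str.maketrans(src, dst)


table = parse_translit("abc", "xyz")
print("aabbcc cab".translate(table))  # xxyyzz zxy
```

Rejecting the mismatch while parsing means a malformed command fails before any line of the buffer has been touched.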
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: exhash
-Version: 0.3.
+Version: 0.3.3
 Classifier: Programming Language :: Rust
 Classifier: Programming Language :: Python :: Implementation :: CPython
 Summary: Verified line-addressed file editor using lnhash addresses
@@ -52,6 +52,8 @@ lnhashview path/to/file.txt
 lnhashview path/to/file.txt 10 20
 ```
 
+If `end` is past EOF, `lnhashview` returns through the last available line instead of failing.
+
 ### Edit
 
 ```bash
@@ -80,6 +82,12 @@ exhash file.txt '%j'
 
 # Move a line to EOF using $ as the destination
 exhash file.txt '12|abcd|m$'
+
+# Create a missing file by treating it as empty input
+exhash new.txt '0|0000|a' <<'EOF'
+first line
+.
+EOF
 ```
 
 Substitute uses Rust regex syntax:
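The missing-file rule shown in the hunk above (a missing file counts as empty only for commands that are valid on empty input) can be sketched as follows. The sketch is modelled on the `_can_create_missing` and `_load_buffer` helpers that appear in the `python/exhash/__init__.py` part of this diff; the public names `can_create_missing` and `load_lines` are hypothetical, and the CLI's Rust-side check may differ in detail.

```python
import tempfile
from pathlib import Path


def can_create_missing(addr, op):
    """True only for commands that are valid against an empty buffer."""
    return addr == "0|0000|" and op in {"a", "i"}


def load_lines(path, missing_ok=False):
    """Read a file as lines; treat a missing file as empty only if allowed."""
    p = Path(path)
    try:
        return p.read_text().splitlines()
    except FileNotFoundError:
        if not missing_ok or not p.parent.exists():
            raise  # preserve the original file-not-found error
        return []


with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "new.txt"
    print(load_lines(target, can_create_missing("0|0000|", "a")))  # []
    try:
        load_lines(target, can_create_missing("1|abcd|", "d"))
    except FileNotFoundError:
        print("file-not-found preserved")
```

Tying the fallback to the command rather than to a flag keeps a typo in the path from silently creating a file when the command could never have applied to empty input.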
@@ -99,6 +107,8 @@ For `a/i/c` commands, provide the text block on stdin:
 printf "new line 1\nnew line 2\n.\n" | exhash file.txt "2|beef|a"
 ```
 
+If the file does not exist and the command set is valid on empty input, exhash treats it as an empty file and writes the result. For example, `0|0000|a` can create a new file.
+
 ### Stdin filter mode
 
 ```bash
@@ -117,13 +127,13 @@ from exhash import exhash, exhash_file, lnhash, lnhashview, lnhashview_file, lin
 
 ```py
 text = "foo\nbar\n"
-view = lnhashview(text)
-view = lnhashview_file("f.py") #
+view = lnhashview(text) # ["1|a1b2| foo", "2|c3d4| bar"]
+view = lnhashview_file("f.py", start=1, end=260) # end past EOF is clamped
 ```
 
 ### Editing
 
-`exhash(text, cmds, sw=4)` takes the text and a required iterable of command strings (use `[]` for no-op). `sw` controls how far `<` and `>` shift. For `a`/`i`/`c` commands,
+`exhash(text, cmds, sw=4)` takes the text and a required iterable of command strings (use `[]` for no-op). `sw` controls how far `<` and `>` shift. For multiline `a`/`i`/`c` commands, include the inserted text in the same command string using newline characters, e.g. `["12|abcd|c\nnew line 1\nnew line 2"]`. Do not use `.` terminators, and do not split the text block into separate `cmds` entries. If you include a final `.` line, it is inserted literally and exhash emits a warning.
 
 ```py
 addr = lnhash(1, "foo") # "1|a1b2|"
@@ -138,12 +148,15 @@ res = exhash(text, [f"{a1}s/foo/FOO/", f"{a2}s/bar/BAR/"])
 # Hashes are checked just-in-time per command.
 # If earlier commands change/shift a later target line, recompute lnhash first.
 
-# Append multiline text (no dot terminator)
+# Append multiline text in the same command string (no dot terminator)
 res = exhash(text, [f"{addr}a\nnew line 1\nnew line 2"])
 
 # Wrong for the Python API: the trailing "." would be inserted literally
 # res = exhash(text, [f"{addr}a\nnew line 1\nnew line 2\n."])
 
+# Also wrong: do not split the inserted text into separate cmds entries
+# res = exhash(text, [f"{addr}a", "new line 1", "new line 2"])
+
 # Change shift width for < and >
 res = exhash(text, [f"{addr}>1"], sw=2)
 
@@ -157,18 +170,46 @@ res = exhash("foo\nbar\n", [f"{a1},{a2}s/foo\nbar/replaced/"])
 
 ### File helpers
 
-`exhash_file`
+`lnhashview_file` reads directly from one file path. `exhash_file(path, cmds, sw=4, inplace=False)` uses `path` as the default file context for unqualified addresses, and also accepts file-qualified source and `m`/`t` destination addresses:
 
 ```py
 view = lnhashview_file("file.py")
 
-# Returns
+# Returns FileSetEditResult, files unchanged
 res = exhash_file("file.py", [f"{addr}s/foo/bar/"])
+print(res.changed) # ["file.py"]
+print(res["file.py"].lines)
+print(res.format_diff()) # includes --- file.py / +++ file.py headers
 
-# With inplace=True, writes
+# With inplace=True, writes changed files after every command succeeds
+# and returns the combined diff string.
 diff = exhash_file("file.py", [f"{addr}s/foo/bar/"], inplace=True)
+
+# Missing files are treated as empty only when the command is valid on empty input.
+diff = exhash_file("new.py", ["0|0000|a\nprint('hi')"], inplace=True)
+
+# File-qualified addresses can edit or transfer lines across files.
+cmds = [
+    "src/a.py:24|8f12|,38|c0de|m src/b.py:$",
+    r"src/a.py:5|91aa|s/from \.b import old/from \.b import helper/",
+]
+diff = exhash_file("src/a.py", cmds, inplace=True)
 ```
 
+A file prefix is separated from the address with `:`. Escape literal colons in filenames as `\:` and literal backslashes as `\\`.
+
+`exhash_file(..., inplace=False)` returns a `FileSetEditResult`:
+
+- `res.files` — dict of path to `FileEditResult`
+- `res.changed` — changed paths, in first-touch order
+- `res.default_path` — the default path passed to `exhash_file`
+- `res[path]` — shorthand for `res.files[path]`
+- `res.format_diff(context=1)` — combined diff with `--- path` / `+++ path` headers
+
+### Pyskill
+
+The package registers `exhash.skill` as a pyskill exposing the primary Python APIs with LLM-oriented workflow docs. Use `doc(exhash.skill)` after importing it through a pyskills host.
+
 ### EditResult
 
 `exhash()` returns an `EditResult` with attributes (also accessible via `res["key"]`):
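The file-prefix escaping rule added in the hunk above (`\:` for a literal colon, `\\` for a literal backslash) can be tried in isolation. The function below follows the logic of the private `_unescape_path` helper that appears in the `python/exhash/__init__.py` part of this diff; the public name `unescape_path` is only for illustration.

```python
def unescape_path(path):
    r"""Turn `\:` into `:` and `\\` into `\`; keep other backslashes as-is."""
    out, escaped = [], False
    for ch in path:
        if escaped:
            if ch not in ":\\":
                out.append("\\")  # not a recognised escape; keep the backslash
            out.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
        else:
            out.append(ch)
    if escaped:
        out.append("\\")  # a trailing lone backslash is literal
    return "".join(out)


print(unescape_path(r"notes\:2024.txt"))  # notes:2024.txt
print(unescape_path(r"dir\\file.txt"))    # dir\file.txt
print(unescape_path(r"dir\name.txt"))     # dir\name.txt
```

Because unrecognised escapes keep their backslash, Windows-style paths with single backslashes pass through unchanged unless the backslash happens to precede `:` or another backslash.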
@@ -184,6 +225,8 @@ diff = exhash_file("file.py", [f"{addr}s/foo/bar/"], inplace=True)
 ```py
 res = exhash(text, [f"{addr}s/foo/baz/"])
 print(res.format_diff())
+# --- original
+# +++ modified
 # -1|a1b2| foo
 # +1|c3d4| baz
 # 2|e5f6| bar
@@ -37,6 +37,8 @@ lnhashview path/to/file.txt
 lnhashview path/to/file.txt 10 20
 ```
 
+If `end` is past EOF, `lnhashview` returns through the last available line instead of failing.
+
 ### Edit
 
 ```bash
@@ -65,6 +67,12 @@ exhash file.txt '%j'
 
 # Move a line to EOF using $ as the destination
 exhash file.txt '12|abcd|m$'
+
+# Create a missing file by treating it as empty input
+exhash new.txt '0|0000|a' <<'EOF'
+first line
+.
+EOF
 ```
 
 Substitute uses Rust regex syntax:
@@ -84,6 +92,8 @@ For `a/i/c` commands, provide the text block on stdin:
 printf "new line 1\nnew line 2\n.\n" | exhash file.txt "2|beef|a"
 ```
 
+If the file does not exist and the command set is valid on empty input, exhash treats it as an empty file and writes the result. For example, `0|0000|a` can create a new file.
+
 ### Stdin filter mode
 
 ```bash
@@ -102,13 +112,13 @@ from exhash import exhash, exhash_file, lnhash, lnhashview, lnhashview_file, lin
 
 ```py
 text = "foo\nbar\n"
-view = lnhashview(text)
-view = lnhashview_file("f.py") #
+view = lnhashview(text) # ["1|a1b2| foo", "2|c3d4| bar"]
+view = lnhashview_file("f.py", start=1, end=260) # end past EOF is clamped
 ```
 
 ### Editing
 
-`exhash(text, cmds, sw=4)` takes the text and a required iterable of command strings (use `[]` for no-op). `sw` controls how far `<` and `>` shift. For `a`/`i`/`c` commands,
+`exhash(text, cmds, sw=4)` takes the text and a required iterable of command strings (use `[]` for no-op). `sw` controls how far `<` and `>` shift. For multiline `a`/`i`/`c` commands, include the inserted text in the same command string using newline characters, e.g. `["12|abcd|c\nnew line 1\nnew line 2"]`. Do not use `.` terminators, and do not split the text block into separate `cmds` entries. If you include a final `.` line, it is inserted literally and exhash emits a warning.
 
 ```py
 addr = lnhash(1, "foo") # "1|a1b2|"
@@ -123,12 +133,15 @@ res = exhash(text, [f"{a1}s/foo/FOO/", f"{a2}s/bar/BAR/"])
 # Hashes are checked just-in-time per command.
 # If earlier commands change/shift a later target line, recompute lnhash first.
 
-# Append multiline text (no dot terminator)
+# Append multiline text in the same command string (no dot terminator)
 res = exhash(text, [f"{addr}a\nnew line 1\nnew line 2"])
 
 # Wrong for the Python API: the trailing "." would be inserted literally
 # res = exhash(text, [f"{addr}a\nnew line 1\nnew line 2\n."])
 
+# Also wrong: do not split the inserted text into separate cmds entries
+# res = exhash(text, [f"{addr}a", "new line 1", "new line 2"])
+
 # Change shift width for < and >
 res = exhash(text, [f"{addr}>1"], sw=2)
 
@@ -142,18 +155,46 @@ res = exhash("foo\nbar\n", [f"{a1},{a2}s/foo\nbar/replaced/"])
 
 ### File helpers
 
-`exhash_file`
+`lnhashview_file` reads directly from one file path. `exhash_file(path, cmds, sw=4, inplace=False)` uses `path` as the default file context for unqualified addresses, and also accepts file-qualified source and `m`/`t` destination addresses:
 
 ```py
 view = lnhashview_file("file.py")
 
-# Returns
+# Returns FileSetEditResult, files unchanged
 res = exhash_file("file.py", [f"{addr}s/foo/bar/"])
+print(res.changed) # ["file.py"]
+print(res["file.py"].lines)
+print(res.format_diff()) # includes --- file.py / +++ file.py headers
 
-# With inplace=True, writes
+# With inplace=True, writes changed files after every command succeeds
+# and returns the combined diff string.
 diff = exhash_file("file.py", [f"{addr}s/foo/bar/"], inplace=True)
+
+# Missing files are treated as empty only when the command is valid on empty input.
+diff = exhash_file("new.py", ["0|0000|a\nprint('hi')"], inplace=True)
+
+# File-qualified addresses can edit or transfer lines across files.
+cmds = [
+    "src/a.py:24|8f12|,38|c0de|m src/b.py:$",
+    r"src/a.py:5|91aa|s/from \.b import old/from \.b import helper/",
+]
+diff = exhash_file("src/a.py", cmds, inplace=True)
 ```
 
+A file prefix is separated from the address with `:`. Escape literal colons in filenames as `\:` and literal backslashes as `\\`.
+
+`exhash_file(..., inplace=False)` returns a `FileSetEditResult`:
+
+- `res.files` — dict of path to `FileEditResult`
+- `res.changed` — changed paths, in first-touch order
+- `res.default_path` — the default path passed to `exhash_file`
+- `res[path]` — shorthand for `res.files[path]`
+- `res.format_diff(context=1)` — combined diff with `--- path` / `+++ path` headers
+
+### Pyskill
+
+The package registers `exhash.skill` as a pyskill exposing the primary Python APIs with LLM-oriented workflow docs. Use `doc(exhash.skill)` after importing it through a pyskills host.
+
 ### EditResult
 
 `exhash()` returns an `EditResult` with attributes (also accessible via `res["key"]`):
@@ -169,6 +210,8 @@ diff = exhash_file("file.py", [f"{addr}s/foo/bar/"], inplace=True)
 ```py
 res = exhash(text, [f"{addr}s/foo/baz/"])
 print(res.format_diff())
+# --- original
+# +++ modified
 # -1|a1b2| foo
 # +1|c3d4| baz
 # 2|e5f6| bar
@@ -4,7 +4,7 @@ build-backend = "maturin"
 
 [project]
 name = "exhash"
-version = "0.3.
+version = "0.3.3"
 description = "Verified line-addressed file editor using lnhash addresses"
 license = {text = "MIT OR Apache-2.0"}
 requires-python = ">=3.10"
@@ -19,6 +19,8 @@ classifiers = [
 Homepage = "https://github.com/AnswerDotAI/exhash"
 Repository = "https://github.com/AnswerDotAI/exhash"
 Issues = "https://github.com/AnswerDotAI/exhash/issues"
+[project.entry-points.pyskills]
+exhash = "exhash.skill"
 
 [tool.maturin]
 features = ["extension-module"]
@@ -0,0 +1,337 @@
+import re
+from difflib import SequenceMatcher
+from pathlib import Path
+from .exhash import line_hash as _line_hash, lnhash as _lnhash, lnhashview as _lnhashview, exhash as _exhash
+
+def line_hash(line:str) -> str:
+    'Return a 4-char lowercase hex hash for a single line of text.'
+    return _line_hash(line)
+
+
+def lnhash(lineno:int, line:str) -> str:
+    'Return an lnhash address ``lineno|hash|`` for ``line`` at 1-based ``lineno``.'
+    return _lnhash(lineno, line)
+
+
+def lnhashview(text:str, start:int=None, end:int=None) -> list[str]:
+    'Return lines formatted as ``lineno|hash| content``. Optional 1-based ``start``/``end`` filter the range; ``end`` past EOF is clamped.'
+    return _lnhashview(text, start, end)
+
+
+def lnhashview_file(path:str, start:int=None, end:int=None) -> list[str]:
+    'Return lines formatted as ``lineno|hash| content`` for file at ``path``. Optional 1-based ``start``/``end`` filter the range; ``end`` past EOF is clamped.'
+    return _lnhashview(Path(path).read_text(), start, end)
+
+
+def exhash(text:str, cmds:list[str], sw:int=4):
+    """Verified line-addressed editor. Apply commands to `text`, return an EditResult.
+
+    Commands primarily use lnhash addresses: ``lineno|hash|cmd`` where hash is
+    a 4-char hex content hash. Use ``lnhashview(text)`` or
+    ``lnhash(lineno, line)`` to get hashed addresses.
+    Each command's hashes are verified against current text immediately before
+    that command executes.
+
+    Addressing:
+      Single: ``12|a3f2|cmd``
+      Range: ``12|a3f2|,15|b1c3|cmd``
+      Last: ``$cmd`` (last line)
+      Whole: ``%cmd`` (whole file, same as ``1,$``)
+      Special: ``0|0000|`` targets before line 1 (only with a or i)
+
+    Commands:
+      s/pat/rep/[flags] Substitute using Rust regex syntax.
+        Replacement supports $1, $0, ${name}. Flags: g=all, i=case-insensitive
+        Any non-alphanumeric delimiter works: s@pat@rep@, s|pat|rep|g
+        Literal newlines in pat/rep are supported (joins/splits lines)
+      y/src/dst/ Transliterate chars in-place (also supports custom delimiters;
+        source and destination lengths must match)
+      d Delete line(s)
+      a Append text after line
+      i Insert text before line
+      c Change/replace line(s)
+      j Join with next line; with range, joins all
+      m dest Move line(s) after dest address
+      t dest Copy line(s) after dest address
+      >[n] Indent n levels (default 1, `sw` spaces each)
+      <[n] Dedent n levels (default 1, `sw` spaces each)
+      sort Sort lines alphabetically
+      p Print (include in output without changing)
+      g/pat/cmd Global: run cmd on matching lines (custom delimiters ok: g@pat@cmd)
+      g!/pat/cmd Inverted global (also v/pat/cmd; custom delimiters ok)
+
+    `sw` controls shift width for `<` and `>` and defaults to 4.
+
+    For multiline a/i/c commands, include the inserted text in the same command
+    string using newline characters, e.g. ``["12|abcd|c\nnew line 1\nnew line 2"]``.
+    Do not use ``.`` terminators, and do not split the text block into separate
+    ``cmds`` entries. If you include a final ``.`` line, it is inserted literally
+    and exhash emits a warning.
+
+    Returns an EditResult with attributes (also accessible as dict keys):
+      lines list of output lines
+      hashes lnhash for each output line
+      modified 1-based line numbers of modified/added lines
+      deleted 1-based line numbers of removed lines (in original)
+      origins for each output line, the 1-based original line number (None if inserted)
+
+    Call ``res.format_diff(context=1)`` for a unified-diff-style summary.
+    Non-empty diffs start with ``--- original`` and ``+++ modified`` headers.
+
+    Examples::
+
+        from exhash import exhash, lnhash, lnhashview
+        text = "foo\\nbar\\n"
+        addr = lnhash(1, "foo") # "1|a1b2|"
+        res = exhash(text, [f"{addr}s/foo/baz/"])
+        print(res["lines"]) # ["baz", "bar"]
+        print(res.format_diff()) # unified-diff-style summary
+    """
+    return _exhash(text, *cmds, sw=sw)
+
+
+class FileEditResult:
+    'Edited state for one file.'
+    def __init__(self, path, original_lines, lines):
+        self.path = _norm_path(path)
+        self.original_lines = list(original_lines)
+        self.lines = list(lines)
+        self.hashes = [lnhash(i + 1, line) for i, line in enumerate(self.lines)]
+
+    @property
+    def changed(self): return self.original_lines != self.lines
+
+    def __getitem__(self, key):
+        if key in {"lines", "hashes", "original_lines"}: return getattr(self, key)
+        raise KeyError(key)
+
+    def format_diff(self, context=1): return _format_file_diff(self.path, self.original_lines, self.lines, context)
+
+
+class FileSetEditResult:
+    'Edited state for an exhash_file command set.'
+    def __init__(self, files, default_path):
+        self.files = files
+        self.default_path = default_path
+        self.changed = [path for path, result in files.items() if result.changed]
+
+    def __getitem__(self, path): return self.files[_norm_path(path)]
+
+    def format_diff(self, context=1): return ''.join(self.files[path].format_diff(context) for path in self.changed)
+
+
+_ADDR_RE = re.compile(r'(?:\$|%|\d+\|[0-9a-fA-F]{4}\|)')
+_LNHASH_RE = re.compile(r'(\d+)\|([0-9a-fA-F]{4})\|')
+
+
+def _norm_path(path): return str(Path(path))
+
+
+def _text_from_lines(lines): return '\n'.join(lines) + ('\n' if lines else '')
+
+
+def _write_lines(path, lines): Path(path).write_text(_text_from_lines(lines))
+
+
+def _format_file_diff(path, old_lines, new_lines, context=1):
+    if old_lines == new_lines: return ''
+    events = []
+    for tag, i1, i2, j1, j2 in SequenceMatcher(a=old_lines, b=new_lines, autojunk=False).get_opcodes():
+        if tag == 'equal': events += [(' ', j + 1, new_lines[j]) for j in range(j1, j2)]
+        elif tag == 'delete': events += [('-', i + 1, old_lines[i]) for i in range(i1, i2)]
+        elif tag == 'insert': events += [('+', j + 1, new_lines[j]) for j in range(j1, j2)]
+        elif tag == 'replace':
+            events += [('-', i + 1, old_lines[i]) for i in range(i1, i2)]
+            events += [('+', j + 1, new_lines[j]) for j in range(j1, j2)]
+    interesting = set()
+    for i, (tag, _, _) in enumerate(events):
+        if tag != ' ': interesting.update(range(max(0, i - context), min(len(events), i + context + 1)))
+    out, last = [f'--- {path}', f'+++ {path}'], None
+    for i in sorted(interesting):
+        if last is not None and i > last + 1: out.append('---')
+        tag, lineno, line = events[i]
+        out.append(f'{tag}{lnhash(lineno, line)} {line}')
+        last = i
+    return '\n'.join(out) + '\n'
+
+
+def _unescape_path(path):
+    out, escaped = [], False
+    for ch in path:
+        if escaped:
+            if ch not in ':\\': out.append('\\')
+            out.append(ch)
+            escaped = False
+        elif ch == '\\': escaped = True
+        else: out.append(ch)
+    if escaped: out.append('\\')
+    return ''.join(out)
+
+
+def _split_file_prefix(s):
+    if _ADDR_RE.match(s): return None, s
+    escaped = False
+    for i, ch in enumerate(s):
+        if escaped:
+            escaped = False
+            continue
+        if ch == '\\':
+            escaped = True
+            continue
+        if ch == ':' and _ADDR_RE.match(s[i + 1:]):
+            path = _unescape_path(s[:i])
+            if not path: raise ValueError('empty filename prefix')
+            return _norm_path(path), s[i + 1:]
+    return None, s
+
+
+def _parse_fileaddr(s, default_path):
+    path, rest = _split_file_prefix(s)
+    path = path or default_path
+    m = _ADDR_RE.match(rest)
+    if not m: raise ValueError(f'expected exhash address near {s[:40]!r}')
+    return path, m.group(0), rest[m.end():]
+
+
+def _parse_file_command(raw, default_path):
+    if not raw.strip(): return None
+    src, addr1, rest = _parse_fileaddr(raw.lstrip(), default_path)
+    has_comma, addr2, local = False, None, addr1
+    if rest.startswith(','):
+        has_comma = True
+        src2, addr2, rest = _parse_fileaddr(rest[1:], src)
+        if src2 != src: raise ValueError('cross-file ranges are invalid')
+        local += ',' + addr2
+    local += rest
+    body = rest.lstrip()
+    op = body[:1] if body[:1] in {'m', 't'} else None
+    dest = dest_addr = None
+    if op:
+        dest, dest_addr, tail = _parse_fileaddr(body[1:].strip(), src)
+        if tail.strip(): raise ValueError(f'unexpected trailing characters after destination: {tail!r}')
+    return dict(src=src, addr1=addr1, addr2=addr2, has_comma=has_comma, rest=rest, local=local, op=op, dest=dest, dest_addr=dest_addr)
+
+
+def _load_buffer(buffers, path, missing_ok=False):
+    if path in buffers: return buffers[path]
+    p = Path(path)
+    try: lines = p.read_text().splitlines()
+    except FileNotFoundError:
+        if not missing_ok: raise
+        if not p.parent.exists(): raise
+        lines = []
+    buffers[path] = dict(path=path, original=list(lines), lines=list(lines))
+    return buffers[path]
+
+
+def _can_create_missing(parsed): return parsed['addr1'] == '0|0000|' and parsed['rest'].lstrip()[:1] in {'a', 'i'}
+
+
+def _split_lnhash_addr(addr):
+    m = _LNHASH_RE.fullmatch(addr)
+    if not m: raise ValueError(f'expected lnhash address, got {addr!r}')
+    return int(m.group(1)), m.group(2).lower()
+
+
+def _line_no(lines, addr, allow_zero=False):
+    if addr == '$':
+        if not lines: raise ValueError("address '$' out of range on empty file")
+        return len(lines)
+    if addr == '%': raise ValueError('% is only allowed as a source range')
+    lineno, expected = _split_lnhash_addr(addr)
+    if lineno == 0:
+        if expected != '0000': raise ValueError('0|0000| must have hash 0000')
+        if allow_zero: return 0
+        raise ValueError('address 0 is not allowed here')
+    if lineno > len(lines): raise ValueError(f'address out of range: {lineno} > {len(lines)}')
+    actual = line_hash(lines[lineno - 1])
+    if actual != expected: raise ValueError(f'stale lnhash at line {lineno}: expected {expected}, got {actual}')
+    return lineno
+
+
+def _source_indexes(lines, parsed):
+    if parsed['addr1'] == '%':
+        if parsed['has_comma'] or parsed['addr2'] is not None: raise ValueError('% is already a whole-file range')
+        return (0, len(lines) - 1) if lines else (0, -1)
+    start = _line_no(lines, parsed['addr1'])
+    end = _line_no(lines, parsed['addr2']) if parsed['addr2'] is not None else start
+    if start > end: raise ValueError(f'invalid range: {start}..{end}')
+    return start - 1, end - 1
+
+
+def _dest_index(lines, addr):
+    if addr == '%': raise ValueError('destination % is not allowed')
+    return _line_no(lines, addr, allow_zero=True)
+
+
+def _apply_transfer(buffers, parsed):
+    src = _load_buffer(buffers, parsed['src'])
+    dst = _load_buffer(buffers, parsed['dest'], missing_ok=parsed['dest_addr'] == '0|0000|')
+    s, e = _source_indexes(src['lines'], parsed)
+    dest = _dest_index(dst['lines'], parsed['dest_addr'])
+    segment = src['lines'][s:e + 1] if s <= e else []
+    if parsed['op'] == 't':
+        dst['lines'][dest:dest] = list(segment)
+        return
+    if src is dst:
+        if s <= e and s < dest <= e + 1: raise ValueError('destination is within moved range')
+        del src['lines'][s:e + 1]
+        insert_at = dest if dest <= s else dest - len(segment)
+        src['lines'][insert_at:insert_at] = segment
+    else:
+        del src['lines'][s:e + 1]
+        dst['lines'][dest:dest] = segment
+
+
+def _apply_file_command(buffers, parsed, sw):
+    if parsed['op']:
+        _apply_transfer(buffers, parsed)
+        return
+    buf = _load_buffer(buffers, parsed['src'], missing_ok=_can_create_missing(parsed))
+    res = exhash(_text_from_lines(buf['lines']), [parsed['local']], sw=sw)
+    buf['lines'] = list(res['lines'])
+
+
+def exhash_file(path:str, cmds:list[str], sw:int=4, inplace:bool=False):
+    r'''Read files, apply file-aware exhash commands, and return per-file results or a combined diff.
+
+    Core command syntax is the same as ``exhash(text, cmds, sw=sw)``; run
+    ``doc(exhash)`` for the full command reference. Use ``path`` as the default
+    file context for unqualified addresses. Prefix any source address, and any
+    ``m``/``t`` destination, with ``path:`` to target another file::
+
+        src/a.py:12|a3f2|s/foo/bar/
+        src/a.py:10|aaaa|,20|bbbb|m src/b.py:$
+        src/a.py:10|aaaa|t new.py:0|0000|
+
+    A range must stay within one file. The second address may omit the filename
+    and inherit it from the first address. Cross-file ranges are invalid. Escape
+    literal colons in filenames as ``\:`` and literal backslashes as ``\\\\``.
+
+    For multiline ``a``/``i``/``c`` commands, include the inserted text in the
+    same command string using newline characters. Do not use ``.`` terminators,
+    and do not split the text block into separate ``cmds`` entries.
+
+    Missing files are treated as empty only when the command is valid against an
+    empty buffer, such as ``0|0000|a``/``0|0000|i`` or an ``m``/``t`` destination
+    of ``0|0000|``.
+
+    With ``inplace=False``, return a ``FileSetEditResult`` with ``files``,
+    ``changed``, ``default_path``, ``res[path]``, and
+    ``res.format_diff(context=1)``. With ``inplace=True``, write changed files
+    only after every command succeeds and return the combined diff string. If
+    any command fails, write nothing.
+    '''
+    default_path, buffers = _norm_path(path), {}
+    for raw in cmds:
+        parsed = _parse_file_command(raw, default_path)
+        if parsed is not None: _apply_file_command(buffers, parsed, sw)
+    if not buffers: _load_buffer(buffers, default_path)
+    files = {path: FileEditResult(path, buf['original'], buf['lines']) for path, buf in buffers.items()}
+    result = FileSetEditResult(files, default_path)
+    if inplace:
+        for path in result.changed: _write_lines(path, result[path].lines)
+        return result.format_diff()
+    return result
+
+
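The same-file branch of `_apply_transfer` above adjusts the destination index after the moved segment is removed. The arithmetic can be checked in isolation with a small sketch; `move_lines` is a hypothetical helper that works on a plain list and returns a new list, with no hashing or file I/O, whereas the real function mutates the buffer in place.

```python
def move_lines(lines, s, e, dest):
    """Move lines[s:e+1] so they land after 1-based line `dest` (0 = top).

    `s` and `e` are 0-based inclusive indexes; `dest` counts lines in the
    buffer as it was before the move.
    """
    if s < dest <= e + 1:
        raise ValueError("destination is within moved range")
    segment = lines[s:e + 1]
    rest = lines[:s] + lines[e + 1:]
    # Removing the segment shifts everything after it up by len(segment).
    insert_at = dest if dest <= s else dest - len(segment)
    return rest[:insert_at] + segment + rest[insert_at:]


lines = ["a", "b", "c", "d", "e"]
print(move_lines(lines, 1, 2, 5))  # ['a', 'd', 'e', 'b', 'c']
print(move_lines(lines, 3, 4, 0))  # ['d', 'e', 'a', 'b', 'c']
```

A destination at or before the start of the range needs no adjustment; only destinations after the range are shifted by the segment length.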