reconcile-text 0.8.0 → 0.11.0

package/README.md CHANGED
@@ -1,23 +1,24 @@
  # `reconcile-text`: conflict-free 3-way text merging
 
- A Rust and TypeScript library for merging conflicting text edits without manual intervention. Unlike traditional 3-way merge tools that produce conflict markers, `reconcile-text` automatically resolves conflicts by applying both sets of changes (while updating cursor positions) using an algorithm inspired by Operational Transformation.
+ A Rust, TypeScript, and Python library for merging conflicting text edits without manual intervention. Unlike traditional 3-way merge tools that produce conflict markers, `reconcile-text` automatically resolves conflicts by applying both sets of changes (while updating cursor positions) using an algorithm inspired by Operational Transformation.
 
  ## Try it
 
- ✨ **[Try the interactive demo](https://schmelczer.dev/reconcile)** to see it in action!
+ ✨ **[Try the interactive demo][8]** to see it in action!
 
  ### Install it in your project
 
- - `cargo add reconcile-text` ([reconcile-text on crates.io](https://crates.io/crates/reconcile-text))
- - `npm install reconcile-text` ([reconcile-text on NPM](https://www.npmjs.com/package/reconcile-text))
+ - `cargo add reconcile-text` ([reconcile-text on crates.io][9])
+ - `npm install reconcile-text` ([reconcile-text on NPM][10])
+ - `uv add reconcile-text` or `pip install reconcile-text` ([reconcile-text on PyPI][27])
 
  ## Key features
 
- - **No conflict markers** Clean, merged output without Git's `<<<<<<<` markers
- - **Cursor tracking** Automatically repositions cursors and selections throughout the merging process
- - **Flexible tokenisation** Word-level (default), character-level, line-level, or custom tokenisation strategies
- - **Unicode support** Full UTF-8 support with proper handling of complex scripts and grapheme clusters
- - **Cross-platform** Native Rust performance with WebAssembly bindings for JavaScript environments
+ - **No conflict markers** - Clean, merged output without Git's `<<<<<<<` markers
+ - **Cursor tracking** - Automatically repositions cursors and selections throughout the merging process
+ - **Flexible tokenisation** - Word-level (default), character-level, line-level, or custom tokenisation strategies
+ - **Unicode support** - Full UTF-8 support with proper handling of complex scripts and grapheme clusters
+ - **Cross-platform** - Native Rust performance with WebAssembly bindings for JavaScript and native bindings for Python
 
  ## Quick start
 
@@ -33,7 +34,7 @@ Alternatively, add `reconcile-text` to your `Cargo.toml`:
 
  ```toml
  [dependencies]
- reconcile-text = "0.5"
+ reconcile-text = "0.8"
  ```
 
  Then start merging:
@@ -52,7 +53,7 @@ let result = reconcile(parent, &left.into(), &right.into(), &*BuiltinTokenizer::
  assert_eq!(result.apply().text(), "Hi beautiful world");
  ```
 
- See the [merge-file example](examples/merge-file.rs) for another example or the [library's documentation](https://docs.rs/reconcile-text/latest/reconcile_text).
+ See the [merge-file example](examples/merge-file.rs) for another example, or the [library's documentation][11].
 
  ### JavaScript/TypeScript
 
@@ -77,7 +78,33 @@ const result = reconcile(parent, left, right);
  console.log(result.text); // "Hi beautiful world"
  ```
 
- See the [example website source](examples/website/src/index.ts) for a more complex example or the [advanced examples document](https://github.com/schmelczer/reconcile/blob/main/docs/advanced-ts.md).
+ See the [example website source](examples/website/src/index.ts) for a more complex example, or the [advanced examples document](docs/advanced-ts.md).
+
+ ### Python
+
+ Install via uv or pip:
+
+ ```sh
+ uv add reconcile-text
+ # or: pip install reconcile-text
+ ```
+
+ Then use it in your application:
+
+ ```python
+ from reconcile_text import reconcile
+
+ # Start with the original text
+ parent = "Hello world"
+ # Two users edit simultaneously
+ left = "Hello beautiful world"
+ right = "Hi world"
+
+ result = reconcile(parent, left, right)
+ print(result["text"]) # "Hi beautiful world"
+ ```
+
+ See the [merge-file example](examples/merge_file.py) for a file-merging CLI, or the [advanced examples document](docs/advanced-python.md) for cursor tracking, change provenance, and compact diffs.
 
  ## Motivation
 
@@ -87,30 +114,81 @@ This creates **Differential Synchronisation** scenarios ([2], [3]): we only know
 
  > **Note**: Some text domains require more careful handling. Legal contracts, for instance, could have unintended meaning changes from conflicting edits that create double negations. At the same time, semantic conflicts can still arise when merging code, even in the absence of syntactic conflicts.
 
- Differential sync is implemented by [universal-sync](https://github.com/invisible-college/universal-sync) and my Obsidian plugin [vault-link](https://github.com/schmelczer/vault-link), and it requires a merging tool which creates conflict-free results for the best user experience.
+ Differential sync is implemented by [universal-sync][12], and it requires a merging tool that creates conflict-free results for the best user experience.
 
  ## How it works
 
  `reconcile-text` starts off similarly to `diff3` ([4], [5]) but adds automated conflict resolution. Given a **parent** document and two modified versions (`left` and `right`), the following happens:
 
- 1. **Tokenisation** Input texts get split into meaningful units (words, characters, etc.) for granular merging
- 2. **Diff computation** Myers' algorithm calculates differences between (parent ↔ left) and (parent ↔ right)
- 3. **Diff optimisation** Operations are reordered and consolidated to maximise chained changes
- 4. **Operational Transformation** Edits are woven together using OT principles, preserving all modifications and updating cursors
+ 1. **Tokenisation** - Input texts are split into meaningful units (words, characters, etc.) for granular merging
+ 2. **Diff computation** - Myers' algorithm calculates differences between (parent ↔ left) and (parent ↔ right)
+ 3. **Diff optimisation** - Operations are reordered and consolidated to maximise chained changes
+ 4. **Operational Transformation** - Edits are woven together using OT principles, preserving all modifications and updating cursors
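
The first two steps of the list above (tokenisation and the two 2-way diffs) can be sketched with the Python standard library alone. This is an illustration, not the library's implementation: `tokenise` and `diff_ops` are hypothetical helper names, the word-splitting rule is an assumption, and `difflib.SequenceMatcher` stands in for Myers' algorithm (it uses a different matching strategy but returns the same kind of edit operations).

```python
import re
from difflib import SequenceMatcher


def tokenise(text: str) -> list[str]:
    """Split text into word tokens, each keeping its trailing whitespace."""
    return re.findall(r"\S+\s*|\s+", text)


def diff_ops(parent: str, edited: str) -> list[tuple[str, list[str], list[str]]]:
    """Return the non-equal edit operations between two tokenised texts."""
    a, b = tokenise(parent), tokenise(edited)
    # difflib is a stand-in here; reconcile-text itself uses Myers' algorithm.
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


parent = "Hello world"
left_ops = diff_ops(parent, "Hello beautiful world")
right_ops = diff_ops(parent, "Hi world")
print(left_ops)   # [('insert', [], ['beautiful '])]
print(right_ops)  # [('replace', ['Hello '], ['Hi '])]
```

The two operation lists touch different tokens of the parent, which is why the quick-start example can keep both edits.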
 
- Whilst the primary goal of `reconcile-text` isn't to implement OT, it provides an elegant way to merge Myers' diff outputs. (For a dedicated Rust OT implementation, see [operational-transform-rs](https://github.com/spebern/operational-transform-rs).) The same could be achieved with CRDTs, which many libraries implement well for text; see [Loro](https://github.com/loro-dev/loro/), [cola](https://github.com/nomad/cola), and [automerge](https://github.com/automerge/automerge) as excellent examples.
+ Whilst the primary goal of `reconcile-text` isn't to implement OT, it provides an elegant way to merge Myers' diff outputs. (For a dedicated Rust OT implementation, see [operational-transform-rs][13].) The same could be achieved with CRDTs, which many libraries implement well for text (see [Loro][14], [cola][15], and [automerge][16]).
 
  However, when only the end result of concurrent changes is observable, merge quality depends entirely on the quality of the underlying 2-way diffs. For instance, `move` operations cannot be supported because Myers' algorithm decomposes them into separate `insert` and `delete` operations, regardless of the merging algorithm used.
 
+ ## Comparison with other approaches
+
+ ### Traditional 3-way merge (diff3, Git)
+
+ Tools like `diff3` ([4]) and Git produce **conflict markers** (`<<<<<<<` / `=======` / `>>>>>>>`) when both sides modify the same region. This works for source code where a human must verify correctness, but breaks the reading flow for prose. `reconcile-text` uses the same diff3-like foundation but adds an OT-inspired resolution step that eliminates conflict markers entirely. Libraries like [diffy][17], [merge3][18] (Rust), and [node-diff3][19] (JavaScript) all fall into this category.
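
To make the conflict-marker behaviour concrete, here is a deliberately naive line-by-line 3-way merge. It is a sketch under a strong assumption (all three versions have the same number of lines, so no alignment step is needed) and is not how `diff3`, Git, or `reconcile-text` are implemented; `naive_merge3` is a hypothetical name.

```python
def naive_merge3(parent: list[str], left: list[str], right: list[str]) -> list[str]:
    """Line-by-line 3-way merge; assumes all versions have the same line count."""
    if not (len(parent) == len(left) == len(right)):
        raise ValueError("this sketch only handles in-place line edits")
    merged: list[str] = []
    for base, ours, theirs in zip(parent, left, right):
        if ours == theirs:      # both sides agree (or neither changed)
            merged.append(ours)
        elif ours == base:      # only the right side changed
            merged.append(theirs)
        elif theirs == base:    # only the left side changed
            merged.append(ours)
        else:                   # both changed the same line differently
            merged += ["<<<<<<< left", ours, "=======", theirs, ">>>>>>> right"]
    return merged


result = naive_merge3(
    ["Hello world", "Second line"],
    ["Hello beautiful world", "Second line"],
    ["Hi world", "Second line, edited"],
)
print("\n".join(result))
```

The first line is reported as a conflict even though the two edits touch different words, which is the case a finer-grained, word-level merge can resolve automatically.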
+
+ ### diff-match-patch
+
+ [diff-match-patch][6] is a widely-used library created by Neil Fraser at Google in 2006, providing character-level diffing (Myers' algorithm), fuzzy string matching (Bitap algorithm), and patch application. It powers Fraser's **Differential Synchronisation** protocol ([2]): compute a diff between two texts, apply the patch to a third text that may have drifted, and repeat until convergence. If a patch fails, the failure self-corrects in the next sync cycle.
+
+ The key differences from `reconcile-text`:
+
+ - **2-way vs 3-way** - diff-match-patch diffs two texts and applies the result as a patch. It has no concept of a common ancestor and cannot reason about "left changes" vs "right changes". `reconcile-text` performs true 3-way merging, understanding the intent behind each side's edits.
+
+ - **Character-level only** - Word-level and line-level diffs require encoding tokens as single Unicode characters before diffing ([7]). `reconcile-text` supports word, character, line, and custom tokenisation natively.
+
+ - **Patches can fail** - `patch_apply` returns a boolean array indicating success per patch; failed patches are silently dropped. In Differential Synchronisation, failures self-correct in the next cycle, but for one-shot merges edits can be lost. `reconcile-text` always produces a complete merged result.
+
+ - **No cursor tracking or change provenance** - diff-match-patch does not reposition cursors or track which side made which edit. `reconcile-text` does both automatically.
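
The token-encoding workaround mentioned under "Character-level only" can be sketched as follows. This is a standard-library illustration of the idea only: `encode_tokens` and `decode` are hypothetical helpers, `difflib` stands in for diff-match-patch's character-level diff, and mapping tokens into the Unicode private-use area (which caps the vocabulary at 6,400 distinct tokens) is an assumption of this sketch.

```python
import re
from difflib import SequenceMatcher

PRIVATE_USE_START = 0xE000  # first code point of the Unicode private-use area


def encode_tokens(texts: list[str]) -> tuple[list[str], list[str]]:
    """Map every distinct word token to a single private-use character."""
    table: list[str] = []       # index -> token
    index: dict[str, int] = {}  # token -> index
    encoded: list[str] = []
    for text in texts:
        chars = []
        for token in re.findall(r"\S+\s*|\s+", text):
            if token not in index:
                index[token] = len(table)
                table.append(token)
            chars.append(chr(PRIVATE_USE_START + index[token]))
        encoded.append("".join(chars))
    return encoded, table


def decode(encoded: str, table: list[str]) -> str:
    """Turn a string of placeholder characters back into the original text."""
    return "".join(table[ord(c) - PRIVATE_USE_START] for c in encoded)


(old, new), table = encode_tokens(["Hello world", "Hello beautiful world"])
# A character-level diff over the placeholders is effectively a word-level diff.
ops = [
    (tag, decode(old[i1:i2], table), decode(new[j1:j2], table))
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes()
]
print(ops)
```

Every caller has to carry the token table alongside the encoded strings, which is the bookkeeping that native word-level tokenisation avoids.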
+
+ See the [comparison example](examples/compare-with-diff-match-patch.rs) for concrete cases where diff-match-patch garbles adjacent edits and silently drops an entire sentence, while `reconcile-text` merges both users' changes correctly.
+
+ > **When to use diff-match-patch instead**: when you don't have a common ancestor, for example synchronising texts that have diverged through an unknown sequence of edits. If you have a common ancestor (as in most version control and collaborative editing scenarios), `reconcile-text` produces more reliable results.
+
+ ### CRDTs (Yjs, Automerge, Loro, diamond-types)
+
+ Conflict-free Replicated Data Types guarantee convergence by mathematical construction: every operation commutes, so the order of application doesn't matter. Libraries like [Yjs][20] (and its Rust port [Yrs][21]), [Automerge][16], [Loro][14], [cola][15], and [diamond-types][22] implement this approach.
+
+ CRDTs capture every individual keystroke or operation, assigning each a unique identity. This makes them ideal when you control the complete editing infrastructure: the editor, the transport layer, and the storage format. They work peer-to-peer, handle arbitrary numbers of concurrent editors, and never lose an edit.
+
+ The trade-off is that CRDTs require **maintaining document state over time** - an operation log or internal data structure that grows with the document's edit history. You cannot simply hand a CRDT library three plain strings and get a merged result. This makes them unsuitable for Differential Synchronisation scenarios where you only observe the final state of each document, which is exactly the niche `reconcile-text` fills.
+
+ > **When to use CRDTs instead**: if you control the complete editing stack and can capture every operation as it happens, CRDTs provide stronger convergence guarantees. They also support more than two concurrent editors naturally, whereas `reconcile-text` merges exactly two forks at a time (though merges can be chained).
+
+ ### Operational Transformation (OT)
+
+ OT libraries like [ot.js][23] and [ShareJS][24] transform concurrent operations against each other so that applying them in any order produces the same result. Like CRDTs, they capture individual operations and require infrastructure to coordinate them, typically a central server that determines the canonical operation order.
+
+ `reconcile-text` borrows the *concept* of OT (transforming one side's edits against the other) but applies it to a different problem. Instead of transforming individual keystrokes in real time, it transforms the consolidated diff output of two complete edits. This means it doesn't need a server, doesn't need to capture operations as they happen, and works entirely offline.
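
The idea of transforming one side's edits against the other can be shown with a minimal sketch. It is not the library's algorithm: the `Edit` type and `transform` function are hypothetical, offsets are character positions in the parent, and touching or overlapping edits (which need a tie-breaking rule) are rejected rather than resolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """Replace text[start:end] with `insert` (a hypothetical type for this sketch)."""

    start: int
    end: int
    insert: str

    def apply(self, text: str) -> str:
        return text[: self.start] + self.insert + text[self.end :]


def transform(edit: Edit, against: Edit) -> Edit:
    """Rebase `edit` so that it can be applied after `against` has been applied."""
    if edit.end < against.start:
        return edit  # entirely before the other edit: offsets are unaffected
    if edit.start > against.end:
        # Entirely after: shift by how much the other edit changed the length.
        shift = len(against.insert) - (against.end - against.start)
        return Edit(edit.start + shift, edit.end + shift, edit.insert)
    # Touching or overlapping edits need a tie-breaking rule, omitted here.
    raise ValueError("touching or overlapping edits are not handled by this sketch")


parent = "Hello world"
left = Edit(6, 6, "beautiful ")  # "Hello world" -> "Hello beautiful world"
right = Edit(0, 5, "Hi")         # "Hello world" -> "Hi world"

left_then_right = transform(right, left).apply(left.apply(parent))
right_then_left = transform(left, right).apply(right.apply(parent))
print(left_then_right, "|", right_then_left)
```

Both application orders yield "Hi beautiful world", matching the quick-start result, because each edit is rebased onto the text produced by the other.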
+
+ > **When to use OT instead**: if you need real-time collaboration with sub-second latency and can run a coordination server, dedicated OT libraries handle this well. `reconcile-text` is designed for merge points, not live keystroke-by-keystroke synchronisation.
+
  ## Development
 
  Contributions are welcome!
 
  ### Environment
 
+ #### Python setup
+
+ Install [uv](https://docs.astral.sh/uv/getting-started/installation/) and build the extension for development:
+
+ ```sh
+ cd reconcile-python
+ uv run maturin develop
+ ```
+
  #### Node.js setup
 
- 1. Install [nvm](https://github.com/nvm-sh/nvm):
+ 1. Install [nvm][25]:
  ```sh
  curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
  ```
@@ -118,14 +196,14 @@ Contributions are welcome!
  ```sh
  nvm install 22 && nvm use 22
  ```
- 3. Optionally, set as default:
+ 3. Optionally, set as default:
  ```sh
  nvm alias default 22
  ```
 
  #### Rust toolchain
 
- Install [rustup](https://rustup.rs):
+ Install [rustup][26]:
  ```bash
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  ```
@@ -142,8 +220,30 @@ Install [rustup](https://rustup.rs):
 
  [MIT](./LICENSE)
 
- [1]:https://marijnhaverbeke.nl/blog/collaborative-editing-cm.html
+ [1]: https://marijnhaverbeke.nl/blog/collaborative-editing-cm.html
  [2]: https://neil.fraser.name/writing/sync/
  [3]: https://www.cis.upenn.edu/~bcpierce/papers/diff3-short.pdf
  [4]: https://blog.jcoglan.com/2017/05/08/merging-with-diff3/
  [5]: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/35605.pdf
+ [6]: https://github.com/google/diff-match-patch
+ [7]: https://github.com/google/diff-match-patch/wiki/Line-or-Word-Diffs
+ [8]: https://schmelczer.dev/reconcile
+ [9]: https://crates.io/crates/reconcile-text
+ [10]: https://www.npmjs.com/package/reconcile-text
+ [11]: https://docs.rs/reconcile-text/latest/reconcile_text
+ [12]: https://github.com/invisible-college/universal-sync
+ [13]: https://github.com/spebern/operational-transform-rs
+ [14]: https://github.com/loro-dev/loro/
+ [15]: https://github.com/nomad/cola
+ [16]: https://github.com/automerge/automerge
+ [17]: https://crates.io/crates/diffy
+ [18]: https://github.com/breezy-team/merge3-rs
+ [19]: https://github.com/bhousel/node-diff3
+ [20]: https://github.com/yjs/yjs
+ [21]: https://github.com/y-crdt/y-crdt
+ [22]: https://github.com/josephg/diamond-types
+ [23]: https://ot.js.org/
+ [24]: https://github.com/josephg/ShareJS
+ [25]: https://github.com/nvm-sh/nvm
+ [26]: https://rustup.rs
+ [27]: https://pypi.org/project/reconcile-text/