daff 1.1.2 → 1.1.5
- data/README.md +242 -0
- data/lib/daff.rb +1 -0
- data/lib/lib/coopy/coopy.rb +86 -8
- data/lib/lib/coopy/merger.rb +183 -0
- data/lib/lib/coopy/table.rb +2 -0
- metadata +68 -51
data/README.md
ADDED
@@ -0,0 +1,242 @@
|
|
1
|
+
[![Build Status](https://travis-ci.org/paulfitz/daff.svg?branch=master)](https://travis-ci.org/paulfitz/daff)
|
2
|
+
[![NPM version](https://badge.fury.io/js/daff.svg)](http://badge.fury.io/js/daff)
|
3
|
+
[![Gem Version](https://badge.fury.io/rb/daff.svg)](http://badge.fury.io/rb/daff)
|
4
|
+
[![PyPI version](https://badge.fury.io/py/daff.svg)](http://badge.fury.io/py/daff)
|
5
|
+
|
6
|
+
daff: data diff
|
7
|
+
===============
|
8
|
+
|
9
|
+
This is a library for comparing tables, producing a summary of their
|
10
|
+
differences, and using such a summary as a patch file. It is
|
11
|
+
optimized for comparing tables that share a common origin, in other
|
12
|
+
words multiple versions of the "same" table.
|
13
|
+
|
14
|
+
For a live demo, see:
|
15
|
+
> http://paulfitz.github.com/daff/
|
16
|
+
|
17
|
+
Download the code for your preferred language here:
|
18
|
+
> https://github.com/paulfitz/daff/releases
|
19
|
+
|
20
|
+
For certain languages you can use the command-line:
|
21
|
+
````sh
|
22
|
+
npm install daff # node/javascript
|
23
|
+
pip3 install daff # python3
|
24
|
+
gem install daff # ruby
|
25
|
+
````
|
26
|
+
|
27
|
+
Or use the library to view csv diffs on github via a chrome extension:
|
28
|
+
> https://github.com/theodi/csvhub
|
29
|
+
|
30
|
+
The diff format used by `daff` is specified here:
|
31
|
+
> http://dataprotocols.org/tabular-diff-format/
|
32
|
+
|
33
|
+
This library is a stripped down version of the coopy toolbox (see
|
34
|
+
http://share.find.coop). To compare tables from different origins,
|
35
|
+
or with automatically generated IDs, or other complications, check out
|
36
|
+
the coopy toolbox.
|
37
|
+
|
38
|
+
The program
|
39
|
+
-----------
|
40
|
+
|
41
|
+
You can run `daff`/`daff.py`/`daff.rb` as a utility program:
|
42
|
+
````
|
43
|
+
$ daff
|
44
|
+
daff can produce and apply tabular diffs.
|
45
|
+
Call as:
|
46
|
+
daff [--output OUTPUT.csv] a.csv b.csv
|
47
|
+
daff [--output OUTPUT.csv] parent.csv a.csv b.csv
|
48
|
+
daff [--output OUTPUT.jsonbook] a.jsonbook b.jsonbook
|
49
|
+
daff patch [--output OUTPUT.csv] source.csv patch.csv
|
50
|
+
daff trim [--output OUTPUT.csv] source.csv
|
51
|
+
daff render [--output OUTPUT.html] diff.csv
|
52
|
+
|
53
|
+
If you need more control, here is the full list of flags:
|
54
|
+
daff diff [--output OUTPUT.csv] [--context NUM] [--all] [--act ACT] a.csv b.csv
|
55
|
+
--context NUM: show NUM rows of context
|
56
|
+
--all: do not prune unchanged rows
|
57
|
+
--act ACT: show only a certain kind of change (update, insert, delete)
|
58
|
+
|
59
|
+
daff render [--output OUTPUT.html] [--css CSS.css] [--fragment] [--plain] diff.csv
|
60
|
+
--css CSS.css: generate a suitable css file to go with the html
|
61
|
+
--fragment: generate just a html fragment rather than a page
|
62
|
+
--plain: do not use fancy utf8 characters to make arrows prettier
|
63
|
+
````
|
64
|
+
|
65
|
+
Using with git
|
66
|
+
--------------
|
67
|
+
|
68
|
+
Run `daff git csv` to see how to use daff to improve `git`'s handling
|
69
|
+
of csv files.
|
70
|
+
|
71
|
+
````
|
72
|
+
$ daff git csv
|
73
|
+
You can use daff to improve git's handling of csv files, by using it as a
|
74
|
+
diff driver (for showing what has changed) and as a merge driver (for merging
|
75
|
+
changes between multiple versions). Here is how.
|
76
|
+
|
77
|
+
Create and add a file called .gitattributes in the root directory of your
|
78
|
+
repository, containing:
|
79
|
+
|
80
|
+
*.csv diff=daff-diff
|
81
|
+
*.csv merge=daff-merge
|
82
|
+
|
83
|
+
Create a file called .gitconfig in your home directory (or alternatively
|
84
|
+
open .git/config for a particular repository) and add:
|
85
|
+
|
86
|
+
[merge "daff-merge"]
|
87
|
+
name = daff tabular merge
|
88
|
+
driver = daff merge --output %A %O %A %B
|
89
|
+
|
90
|
+
[diff "daff-diff"]
|
91
|
+
command = daff diff --git
|
92
|
+
|
93
|
+
Make sure you can run daff from the command-line as just "daff" - if not,
|
94
|
+
replace "daff" in the driver and command lines above with the correct way
|
95
|
+
to call it.
|
96
|
+
````
|
97
|
+
|
98
|
+
The library
|
99
|
+
-----------
|
100
|
+
|
101
|
+
You can use `daff` as a library from any supported language. We take
|
102
|
+
here the example of Javascript. To use `daff` on a webpage,
|
103
|
+
first include `daff.js`:
|
104
|
+
```html
|
105
|
+
<script src="daff.js"></script>
|
106
|
+
```
|
107
|
+
Or if using node outside the browser:
|
108
|
+
```js
|
109
|
+
var daff = require('daff');
|
110
|
+
```
|
111
|
+
|
112
|
+
For concreteness, assume we have two versions of a table,
|
113
|
+
`data1` and `data2`:
|
114
|
+
```js
|
115
|
+
var data1 = [
|
116
|
+
['Country','Capital'],
|
117
|
+
['Ireland','Dublin'],
|
118
|
+
['France','Paris'],
|
119
|
+
['Spain','Barcelona']
|
120
|
+
];
|
121
|
+
var data2 = [
|
122
|
+
['Country','Code','Capital'],
|
123
|
+
['Ireland','ie','Dublin'],
|
124
|
+
['France','fr','Paris'],
|
125
|
+
['Spain','es','Madrid'],
|
126
|
+
['Germany','de','Berlin']
|
127
|
+
];
|
128
|
+
```
|
129
|
+
|
130
|
+
To make those tables accessible to the library, we wrap them
|
131
|
+
in `daff.TableView`:
|
132
|
+
```js
|
133
|
+
var table1 = new daff.TableView(data1);
|
134
|
+
var table2 = new daff.TableView(data2);
|
135
|
+
```
|
136
|
+
|
137
|
+
We can now compute the alignment between the rows and columns
|
138
|
+
in the two tables:
|
139
|
+
```js
|
140
|
+
var alignment = daff.compareTables(table1,table2).align();
|
141
|
+
```
|
142
|
+
|
143
|
+
To produce a diff from the alignment, we first need a table
|
144
|
+
for the output:
|
145
|
+
```js
|
146
|
+
var data_diff = [];
|
147
|
+
var table_diff = new daff.TableView(data_diff);
|
148
|
+
```
|
149
|
+
|
150
|
+
Using default options for the diff:
|
151
|
+
```js
|
152
|
+
var flags = new daff.CompareFlags();
|
153
|
+
var highlighter = new daff.TableDiff(alignment,flags);
|
154
|
+
highlighter.hilite(table_diff);
|
155
|
+
```
|
156
|
+
|
157
|
+
The diff is now in `data_diff` in highlighter format, see
|
158
|
+
specification here:
|
159
|
+
> http://share.find.coop/doc/spec_hilite.html
|
160
|
+
|
161
|
+
```js
|
162
|
+
[ [ '!', '', '+++', '' ],
|
163
|
+
[ '@@', 'Country', 'Code', 'Capital' ],
|
164
|
+
[ '+', 'Ireland', 'ie', 'Dublin' ],
|
165
|
+
[ '+', 'France', 'fr', 'Paris' ],
|
166
|
+
[ '->', 'Spain', 'es', 'Barcelona->Madrid' ],
|
167
|
+
[ '+++', 'Germany', 'de', 'Berlin' ] ]
|
168
|
+
```
|
169
|
+
|
170
|
+
For visualization, you may want to convert this to a HTML table
|
171
|
+
with appropriate classes on cells so you can color-code inserts,
|
172
|
+
deletes, updates, etc. You can do this with:
|
173
|
+
```js
|
174
|
+
var diff2html = new daff.DiffRender();
|
175
|
+
diff2html.render(table_diff);
|
176
|
+
var table_diff_html = diff2html.html();
|
177
|
+
```
|
178
|
+
|
179
|
+
For 3-way differences (that is, comparing two tables given knowledge
|
180
|
+
of a common ancestor) use `daff.compareTables3` (give ancestor
|
181
|
+
table as the first argument).
|
182
|
+
|
183
|
+
Here is how to apply that difference as a patch:
|
184
|
+
```js
|
185
|
+
var patcher = new daff.HighlightPatch(table1,table_diff);
|
186
|
+
patcher.apply();
|
187
|
+
// table1 should now equal table2
|
188
|
+
```
|
189
|
+
|
190
|
+
For other languages, you should find sample code in
|
191
|
+
the packages on the [Releases](https://github.com/paulfitz/daff/releases) page.
|
192
|
+
|
193
|
+
|
194
|
+
Supported languages
|
195
|
+
-------------------
|
196
|
+
|
197
|
+
The `daff` library is written in [Haxe](http://haxe.org/), which
|
198
|
+
can be translated reasonably well into at least the following languages:
|
199
|
+
|
200
|
+
* Javascript
|
201
|
+
* PHP
|
202
|
+
* Python
|
203
|
+
* Java
|
204
|
+
* C#
|
205
|
+
* C++
|
206
|
+
* (via a hack, just for `daff`) Ruby
|
207
|
+
|
208
|
+
Some translations are done for you on the
|
209
|
+
[Releases](https://github.com/paulfitz/daff/releases) page.
|
210
|
+
To make another translation,
|
211
|
+
follow the
|
212
|
+
[Haxe getting started tutorial](http://haxe.org/doc/start) for the
|
213
|
+
language you care about, then do one of:
|
214
|
+
|
215
|
+
```
|
216
|
+
make js
|
217
|
+
make php
|
218
|
+
make py
|
219
|
+
make java
|
220
|
+
make cs
|
221
|
+
make cpp
|
222
|
+
```
|
223
|
+
|
224
|
+
[@Floppy](https://github.com/Floppy) has made a lovingly-hand-written [native Ruby port](https://github.com/theodi/coopy-ruby) that covers core functionality. I've made a brutally-machine-converted port that is a full translation but less idiomatic.
|
225
|
+
|
226
|
+
For each language, the `daff` library expects to be handed an interface to tables you create, rather than creating them
|
227
|
+
itself. This is to avoid inefficient copies from one format to another. You'll find a `SimpleTable` class you can use if
|
228
|
+
you find this awkward.
|
229
|
+
|
230
|
+
Reading material
|
231
|
+
----------------
|
232
|
+
|
233
|
+
* http://dataprotocols.org/tabular-diff-format/ : a specification of the diff format we use.
|
234
|
+
* http://theodi.org/blog/csvhub-github-diffs-for-csv-files : using this library with github.
|
235
|
+
* http://theodi.org/blog/adapting-git-simple-data : using this library with gitlab.
|
236
|
+
* http://okfnlabs.org/blog/2013/08/08/diffing-and-patching-data.html : a summary of where the library came from.
|
237
|
+
* http://blog.okfn.org/2013/07/02/git-and-github-for-data/ : a post about storing small data in git/github.
|
238
|
+
* http://blog.ouseful.info/2013/08/27/diff-or-chop-github-csv-data-files-and-openrefine/ : counterpoint - a post discussing tracked-changes rather than diffs.
|
239
|
+
|
240
|
+
## License
|
241
|
+
|
242
|
+
daff is distributed under the MIT License.
|
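The highlighter-format output shown in the README's library section can be read row by row: the first cell of each row is an action tag. Below is a minimal Ruby sketch of that reading, not part of daff itself. The names `DIFF`, `summarize` and `split_update` are hypothetical, and the tag meanings follow my reading of the linked specification (`@@` header, `!` schema, `+++` inserted row, `---` deleted row, `->` updated row, `+` row that only gained cells). It ignores longer arrow variants and other tags.

```ruby
# The example diff from the README, as an array of rows.
DIFF = [
  ['!',   '',        '+++',  ''],
  ['@@',  'Country', 'Code', 'Capital'],
  ['+',   'Ireland', 'ie',   'Dublin'],
  ['+',   'France',  'fr',   'Paris'],
  ['->',  'Spain',   'es',   'Barcelona->Madrid'],
  ['+++', 'Germany', 'de',   'Berlin']
].freeze

# Count rows by the action tag in their first cell.
# Simplified: any tag not listed here is counted as a context row.
def summarize(diff)
  summary = Hash.new(0)
  diff.each do |row|
    case row.first
    when '!'   then summary[:schema_rows] += 1
    when '@@'  then summary[:header_rows] += 1
    when '+++' then summary[:inserted_rows] += 1
    when '---' then summary[:deleted_rows] += 1
    when '->'  then summary[:updated_rows] += 1
    when '+'   then summary[:rows_with_added_cells] += 1
    else summary[:context_rows] += 1
    end
  end
  summary
end

# Split an updated cell such as "Barcelona->Madrid" into [old, new].
def split_update(cell)
  cell.split('->', 2)
end
```

Calling `summarize(DIFF)` tallies the rows by kind, and `split_update` recovers the old and new value from a cell in an updated row.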
data/lib/daff.rb
CHANGED
@@ -36,6 +36,7 @@ require_relative 'lib/coopy/highlight_patch_unit'
|
|
36
36
|
require_relative 'lib/coopy/index'
|
37
37
|
require_relative 'lib/coopy/index_item'
|
38
38
|
require_relative 'lib/coopy/index_pair'
|
39
|
+
require_relative 'lib/coopy/merger'
|
39
40
|
require_relative 'lib/coopy/mover'
|
40
41
|
require_relative 'lib/coopy/ordering'
|
41
42
|
require_relative 'lib/coopy/report'
|
data/lib/lib/coopy/coopy.rb
CHANGED
@@ -223,6 +223,8 @@ module Coopy
|
|
223
223
|
css_output = nil
|
224
224
|
fragment = false
|
225
225
|
pretty = true
|
226
|
+
inplace = false
|
227
|
+
git = false
|
226
228
|
flags = ::Coopy::CompareFlags.new
|
227
229
|
flags.always_show_header = true
|
228
230
|
while(more)
|
@@ -275,6 +277,16 @@ module Coopy
|
|
275
277
|
flags.unchanged_context = context if context >= 0
|
276
278
|
args.slice!(i,2)
|
277
279
|
break
|
280
|
+
elsif tag == "--inplace"
|
281
|
+
more = true
|
282
|
+
inplace = true
|
283
|
+
args.slice!(i,1)
|
284
|
+
break
|
285
|
+
elsif tag == "--git"
|
286
|
+
more = true
|
287
|
+
git = true
|
288
|
+
args.slice!(i,1)
|
289
|
+
break
|
278
290
|
end
|
279
291
|
end
|
280
292
|
end
|
@@ -286,9 +298,13 @@ module Coopy
|
|
286
298
|
io.write_stderr(" daff [--output OUTPUT.csv] a.csv b.csv\n")
|
287
299
|
io.write_stderr(" daff [--output OUTPUT.csv] parent.csv a.csv b.csv\n")
|
288
300
|
io.write_stderr(" daff [--output OUTPUT.jsonbook] a.jsonbook b.jsonbook\n")
|
289
|
-
io.write_stderr(" daff patch [--output OUTPUT.csv]
|
301
|
+
io.write_stderr(" daff patch [--inplace] [--output OUTPUT.csv] a.csv patch.csv\n")
|
302
|
+
io.write_stderr(" daff merge [--inplace] [--output OUTPUT.csv] parent.csv a.csv b.csv\n")
|
290
303
|
io.write_stderr(" daff trim [--output OUTPUT.csv] source.csv\n")
|
291
304
|
io.write_stderr(" daff render [--output OUTPUT.html] diff.csv\n")
|
305
|
+
io.write_stderr(" daff git csv\n")
|
306
|
+
io.write_stderr("\n")
|
307
|
+
io.write_stderr("The --inplace option to patch and merge will result in modification of a.csv.\n")
|
292
308
|
io.write_stderr("\n")
|
293
309
|
io.write_stderr("If you need more control, here is the full list of flags:\n")
|
294
310
|
io.write_stderr(" daff diff [--output OUTPUT.csv] [--context NUM] [--all] [--act ACT] a.csv b.csv\n")
|
@@ -296,21 +312,65 @@ module Coopy
|
|
296
312
|
io.write_stderr(" --all: do not prune unchanged rows\n")
|
297
313
|
io.write_stderr(" --act ACT: show only a certain kind of change (update, insert, delete)\n")
|
298
314
|
io.write_stderr("\n")
|
315
|
+
io.write_stderr(" daff diff --git path old-file old-hex old-mode new-file new-hex new-mode\n")
|
316
|
+
io.write_stderr(" --git: process arguments provided by git to diff drivers\n")
|
317
|
+
io.write_stderr("\n")
|
299
318
|
io.write_stderr(" daff render [--output OUTPUT.html] [--css CSS.css] [--fragment] [--plain] diff.csv\n")
|
300
319
|
io.write_stderr(" --css CSS.css: generate a suitable css file to go with the html\n")
|
301
320
|
io.write_stderr(" --fragment: generate just a html fragment rather than a page\n")
|
302
321
|
io.write_stderr(" --plain: do not use fancy utf8 characters to make arrows prettier\n")
|
303
322
|
return 1
|
304
323
|
end
|
305
|
-
output = "-" if output == nil
|
306
324
|
cmd1 = args[0]
|
307
325
|
offset = 1
|
308
|
-
if !Lambda.has(["diff","patch","trim","render"],cmd1)
|
326
|
+
if !Lambda.has(["diff","patch","merge","trim","render","git"],cmd1)
|
309
327
|
if (cmd1.index(".",nil || 0) || -1) != -1 || (cmd1.index("--",nil || 0) || -1) == 0
|
310
328
|
cmd1 = "diff"
|
311
329
|
offset = 0
|
312
330
|
end
|
313
331
|
end
|
332
|
+
if cmd1 == "git"
|
333
|
+
types = args.slice!(offset,args.length - offset)
|
334
|
+
io.write_stdout("You can use daff to improve git's handling of csv files, by using it as a\ndiff driver (for showing what has changed) and as a merge driver (for merging\nchanges between multiple versions). Here is how.\n")
|
335
|
+
io.write_stdout("\n")
|
336
|
+
io.write_stdout("Create and add a file called .gitattributes in the root directory of your\nrepository, containing:\n\n")
|
337
|
+
begin
|
338
|
+
_g2 = 0
|
339
|
+
while(_g2 < types.length)
|
340
|
+
t = types[_g2]
|
341
|
+
_g2+=1
|
342
|
+
io.write_stdout(" *." + _hx_str(t) + " diff=daff-diff\n")
|
343
|
+
io.write_stdout(" *." + _hx_str(t) + " merge=daff-merge\n")
|
344
|
+
end
|
345
|
+
end
|
346
|
+
io.write_stdout("\nCreate a file called .gitconfig in your home directory (or alternatively\nopen .git/config for a particular repository) and add:\n\n")
|
347
|
+
io.write_stdout(" [merge \"daff-merge\"]\n")
|
348
|
+
io.write_stdout(" name = daff tabular merge\n")
|
349
|
+
io.write_stdout(" driver = daff merge --output %A %O %A %B\n\n")
|
350
|
+
io.write_stdout(" [diff \"daff-diff\"]\n")
|
351
|
+
io.write_stdout(" command = daff diff --git\n")
|
352
|
+
io.write_stderr("\n")
|
353
|
+
io.write_stderr("Make sure you can run daff from the command-line as just \"daff\" - if not,\nreplace \"daff\" in the driver and command lines above with the correct way\nto call it.")
|
354
|
+
io.write_stderr("\n")
|
355
|
+
return 0
|
356
|
+
end
|
357
|
+
if git
|
358
|
+
ct = args.length - offset
|
359
|
+
if ct != 7
|
360
|
+
io.write_stderr("Expected 7 parameters from git, but got " + _hx_str(ct) + "\n")
|
361
|
+
return 1
|
362
|
+
end
|
363
|
+
git_args = args.slice!(offset,ct)
|
364
|
+
args.slice!(0,args.length)
|
365
|
+
offset = 0
|
366
|
+
path = git_args[0]
|
367
|
+
old_file = git_args[1]
|
368
|
+
new_file = git_args[4]
|
369
|
+
io.write_stdout("--- a/" + _hx_str(path) + "\n")
|
370
|
+
io.write_stdout("+++ b/" + _hx_str(path) + "\n")
|
371
|
+
args.push(old_file)
|
372
|
+
args.push(new_file)
|
373
|
+
end
|
314
374
|
tool = ::Coopy::Coopy.new
|
315
375
|
tool.io = io
|
316
376
|
parent = nil
|
@@ -318,12 +378,20 @@ module Coopy
|
|
318
378
|
parent = tool.load_table(args[offset])
|
319
379
|
offset+=1
|
320
380
|
end
|
321
|
-
|
381
|
+
aname = args[offset]
|
382
|
+
a = tool.load_table(aname)
|
322
383
|
b = nil
|
323
384
|
b = tool.load_table(args[1 + offset]) if args.length - offset >= 2
|
385
|
+
if inplace
|
386
|
+
io.write_stderr("Please do not use --inplace when specifying an output.\n") if output != nil
|
387
|
+
output = aname
|
388
|
+
return 1
|
389
|
+
end
|
390
|
+
output = "-" if output == nil
|
391
|
+
ok = true
|
324
392
|
if cmd1 == "diff"
|
325
|
-
|
326
|
-
align =
|
393
|
+
ct1 = ::Coopy::Coopy.compare_tables3(parent,a,b)
|
394
|
+
align = ct1.align
|
327
395
|
td = ::Coopy::TableDiff.new(align,flags)
|
328
396
|
o = ::Coopy::SimpleTable.new(0,0)
|
329
397
|
td.hilite(o)
|
@@ -332,6 +400,12 @@ module Coopy
|
|
332
400
|
patcher = ::Coopy::HighlightPatch.new(a,b)
|
333
401
|
patcher.apply
|
334
402
|
tool.save_table(output,a)
|
403
|
+
elsif cmd1 == "merge"
|
404
|
+
merger = ::Coopy::Merger.new(parent,a,b,flags)
|
405
|
+
conflicts = merger.apply
|
406
|
+
ok = conflicts == 0
|
407
|
+
io.write_stderr(_hx_str(conflicts) + " conflict" + _hx_str((((conflicts > 1) ? "s" : ""))) + "\n") if conflicts > 0
|
408
|
+
tool.save_table(output,a)
|
335
409
|
elsif cmd1 == "trim"
|
336
410
|
tool.save_table(output,a)
|
337
411
|
elsif cmd1 == "render"
|
@@ -342,7 +416,11 @@ module Coopy
|
|
342
416
|
tool.save_text(output,renderer.html)
|
343
417
|
tool.save_text(css_output,renderer.sample_css) if css_output != nil
|
344
418
|
end
|
345
|
-
|
419
|
+
if ok
|
420
|
+
return 0
|
421
|
+
else
|
422
|
+
return 1
|
423
|
+
end
|
346
424
|
end
|
347
425
|
|
348
426
|
def Coopy.main
|
@@ -374,7 +452,7 @@ module Coopy
|
|
374
452
|
txt += "\n"
|
375
453
|
end
|
376
454
|
end
|
377
|
-
::Haxe::Log._trace.call(txt,{ file_name: "Coopy.hx", line_number:
|
455
|
+
::Haxe::Log._trace.call(txt,{ file_name: "Coopy.hx", line_number: 432, class_name: "coopy.Coopy", method_name: "show"})
|
378
456
|
end
|
379
457
|
|
380
458
|
def Coopy.jsonify(t)
|
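The `--git` branch added to `coopy.rb` above expects exactly seven arguments from git (path, old-file, old-hex, old-mode, new-file, new-hex, new-mode), keeps the path and the two file names, and prints a `---`/`+++` header before diffing. A standalone sketch of that unpacking follows. `GIT_DIFF_ARG_NAMES`, `parse_git_diff_args` and `diff_header` are hypothetical names chosen for illustration, not functions in the gem.

```ruby
# Argument order as listed in the usage text for `daff diff --git`.
GIT_DIFF_ARG_NAMES = %i[path old_file old_hex old_mode new_file new_hex new_mode].freeze

# Turn the positional arguments into a hash, rejecting any other count
# with the same message the diff above writes to stderr.
def parse_git_diff_args(argv)
  unless argv.length == GIT_DIFF_ARG_NAMES.length
    raise ArgumentError, "Expected 7 parameters from git, but got #{argv.length}"
  end
  GIT_DIFF_ARG_NAMES.zip(argv).to_h
end

# Build the two header lines printed before the tabular diff.
def diff_header(args)
  "--- a/#{args[:path]}\n+++ b/#{args[:path]}\n"
end
```

Only `path`, `old_file` and `new_file` are used afterwards; the hex and mode values are parsed here only so that the positions line up.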
data/lib/lib/coopy/merger.rb
ADDED
@@ -0,0 +1,183 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
# encoding: utf-8
|
3
|
+
|
4
|
+
module Coopy
|
5
|
+
class Merger
|
6
|
+
|
7
|
+
def initialize(parent,local,remote,flags)
|
8
|
+
@parent = parent
|
9
|
+
@local = local
|
10
|
+
@remote = remote
|
11
|
+
@flags = flags
|
12
|
+
end
|
13
|
+
|
14
|
+
# protected - in ruby this doesn't play well with static/inline methods
|
15
|
+
|
16
|
+
attr_accessor :parent
|
17
|
+
attr_accessor :local
|
18
|
+
attr_accessor :remote
|
19
|
+
attr_accessor :flags
|
20
|
+
attr_accessor :order
|
21
|
+
attr_accessor :units
|
22
|
+
attr_accessor :column_order
|
23
|
+
attr_accessor :column_units
|
24
|
+
attr_accessor :row_mix_local
|
25
|
+
attr_accessor :row_mix_remote
|
26
|
+
attr_accessor :column_mix_local
|
27
|
+
attr_accessor :column_mix_remote
|
28
|
+
attr_accessor :conflicts
|
29
|
+
|
30
|
+
public
|
31
|
+
|
32
|
+
def shuffle_dimension(dim_units,len,fate,cl,cr)
|
33
|
+
at = 0
|
34
|
+
begin
|
35
|
+
_g = 0
|
36
|
+
while(_g < dim_units.length)
|
37
|
+
cunit = dim_units[_g]
|
38
|
+
_g+=1
|
39
|
+
if cunit.p < 0
|
40
|
+
if cunit.l < 0
|
41
|
+
if cunit.r >= 0
|
42
|
+
begin
|
43
|
+
cr[cunit.r] = at
|
44
|
+
at
|
45
|
+
end
|
46
|
+
at+=1
|
47
|
+
end
|
48
|
+
else
|
49
|
+
begin
|
50
|
+
cl[cunit.l] = at
|
51
|
+
at
|
52
|
+
end
|
53
|
+
at+=1
|
54
|
+
end
|
55
|
+
elsif cunit.l >= 0
|
56
|
+
if cunit.r < 0
|
57
|
+
else
|
58
|
+
begin
|
59
|
+
cl[cunit.l] = at
|
60
|
+
at
|
61
|
+
end
|
62
|
+
at+=1
|
63
|
+
end
|
64
|
+
end
|
65
|
+
end
|
66
|
+
end
|
67
|
+
begin
|
68
|
+
_g1 = 0
|
69
|
+
while(_g1 < len)
|
70
|
+
x = _g1
|
71
|
+
_g1+=1
|
72
|
+
idx = cl[x]
|
73
|
+
if idx == nil
|
74
|
+
fate.push(-1)
|
75
|
+
else
|
76
|
+
fate.push(idx)
|
77
|
+
end
|
78
|
+
end
|
79
|
+
end
|
80
|
+
return at
|
81
|
+
end
|
82
|
+
|
83
|
+
def shuffle_columns
|
84
|
+
@column_mix_local = {}
|
85
|
+
@column_mix_remote = {}
|
86
|
+
fate = Array.new
|
87
|
+
wfate = self.shuffle_dimension(@column_units,@local.get_width,fate,@column_mix_local,@column_mix_remote)
|
88
|
+
@local.insert_or_delete_columns(fate,wfate)
|
89
|
+
end
|
90
|
+
|
91
|
+
def shuffle_rows
|
92
|
+
@row_mix_local = {}
|
93
|
+
@row_mix_remote = {}
|
94
|
+
fate = Array.new
|
95
|
+
hfate = self.shuffle_dimension(@units,@local.get_height,fate,@row_mix_local,@row_mix_remote)
|
96
|
+
@local.insert_or_delete_rows(fate,hfate)
|
97
|
+
end
|
98
|
+
|
99
|
+
def apply
|
100
|
+
@conflicts = 0
|
101
|
+
ct = ::Coopy::Coopy.compare_tables3(@parent,@local,@remote)
|
102
|
+
align = ct.align
|
103
|
+
@order = align.to_order_pruned(true)
|
104
|
+
@units = @order.get_list
|
105
|
+
@column_order = align.meta.to_order_pruned(false)
|
106
|
+
@column_units = @column_order.get_list
|
107
|
+
allow_insert = @flags.allow_insert
|
108
|
+
allow_delete = @flags.allow_delete
|
109
|
+
allow_update = @flags.allow_update
|
110
|
+
view = @parent.get_cell_view
|
111
|
+
begin
|
112
|
+
_g = 0
|
113
|
+
_g1 = @units
|
114
|
+
while(_g < _g1.length)
|
115
|
+
row = _g1[_g]
|
116
|
+
_g+=1
|
117
|
+
if row.l >= 0 && row.r >= 0 && row.p >= 0
|
118
|
+
_g2 = 0
|
119
|
+
_g3 = @column_units
|
120
|
+
while(_g2 < _g3.length)
|
121
|
+
col = _g3[_g2]
|
122
|
+
_g2+=1
|
123
|
+
if col.l >= 0 && col.r >= 0 && col.p >= 0
|
124
|
+
pcell = @parent.get_cell(col.p,row.p)
|
125
|
+
rcell = @remote.get_cell(col.r,row.r)
|
126
|
+
if !view.equals(pcell,rcell)
|
127
|
+
lcell = @local.get_cell(col.l,row.l)
|
128
|
+
if view.equals(pcell,lcell)
|
129
|
+
@local.set_cell(col.l,row.l,rcell)
|
130
|
+
else
|
131
|
+
@local.set_cell(col.l,row.l,::Coopy::Merger.make_conflicted_cell(view,pcell,lcell,rcell))
|
132
|
+
@conflicts+=1
|
133
|
+
end
|
134
|
+
end
|
135
|
+
end
|
136
|
+
end
|
137
|
+
end
|
138
|
+
end
|
139
|
+
end
|
140
|
+
self.shuffle_columns
|
141
|
+
self.shuffle_rows
|
142
|
+
_it = ::Rb::RubyIterator.new(@column_mix_remote.keys)
|
143
|
+
while(_it.has_next) do
|
144
|
+
x = _it._next
|
145
|
+
x2 = @column_mix_remote[x]
|
146
|
+
begin
|
147
|
+
_g4 = 0
|
148
|
+
_g11 = @units
|
149
|
+
while(_g4 < _g11.length)
|
150
|
+
unit = _g11[_g4]
|
151
|
+
_g4+=1
|
152
|
+
if unit.l >= 0 && unit.r >= 0
|
153
|
+
@local.set_cell(x2,@row_mix_local[unit.l],@remote.get_cell(x,unit.r))
|
154
|
+
elsif unit.p < 0 && unit.r >= 0
|
155
|
+
@local.set_cell(x2,@row_mix_remote[unit.r],@remote.get_cell(x,unit.r))
|
156
|
+
end
|
157
|
+
end
|
158
|
+
end
|
159
|
+
end
|
160
|
+
_it2 = ::Rb::RubyIterator.new(@row_mix_remote.keys)
|
161
|
+
while(_it2.has_next) do
|
162
|
+
y = _it2._next
|
163
|
+
y2 = @row_mix_remote[y]
|
164
|
+
begin
|
165
|
+
_g5 = 0
|
166
|
+
_g12 = @column_units
|
167
|
+
while(_g5 < _g12.length)
|
168
|
+
unit1 = _g12[_g5]
|
169
|
+
_g5+=1
|
170
|
+
@local.set_cell(@column_mix_local[unit1.l],y2,@remote.get_cell(unit1.r,y)) if unit1.l >= 0 && unit1.r >= 0
|
171
|
+
end
|
172
|
+
end
|
173
|
+
end
|
174
|
+
return @conflicts
|
175
|
+
end
|
176
|
+
|
177
|
+
def Merger.make_conflicted_cell(view,pcell,lcell,rcell)
|
178
|
+
return view.to_datum("((( " + _hx_str(view.to_s(pcell)) + " ))) " + _hx_str(view.to_s(lcell)) + " /// " + _hx_str(view.to_s(rcell)))
|
179
|
+
end
|
180
|
+
|
181
|
+
end
|
182
|
+
|
183
|
+
end
|
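The per-cell rule inside `Merger#apply` above reduces to three cases: leave the local cell alone when remote matches the parent, take the remote value when only remote changed, and otherwise write a conflict marker. Here is a self-contained sketch of that rule on plain strings. `merge_cell` and `merge_row` are hypothetical helpers, and `==` stands in for the `view.equals` comparison used by the real code.

```ruby
# Returns [merged_value, conflicted?] for one cell.
def merge_cell(parent, local, remote)
  # Remote did not change the cell: keep whatever local has.
  return [local, false] if parent == remote
  # Only remote changed the cell: take the remote value.
  return [remote, false] if parent == local
  # Both sides differ from the parent: emit a marker laid out like the
  # string built by Merger.make_conflicted_cell.
  ["((( #{parent} ))) #{local} /// #{remote}", true]
end

# Merge one row cell by cell and count the conflicts.
# Assumes the three rows are already aligned and of equal length.
def merge_row(parent, local, remote)
  conflicts = 0
  merged = parent.each_index.map do |i|
    value, conflicted = merge_cell(parent[i], local[i], remote[i])
    conflicts += 1 if conflicted
    value
  end
  [merged, conflicts]
end
```

As in the diff above, a cell that both sides changed is reported as a conflict even when they changed it to the same value. The conflict count is what the `merge` command turns into a non-zero return value.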
data/lib/lib/coopy/table.rb
CHANGED
@@ -12,6 +12,8 @@ module Coopy
|
|
12
12
|
def insertOrDeleteRows(fate,hfate) puts "Abstract Table.insertOrDeleteRows called" end
|
13
13
|
def insertOrDeleteColumns(fate,wfate) puts "Abstract Table.insertOrDeleteColumns called" end
|
14
14
|
def trimBlank() puts "Abstract Table.trimBlank called" end
|
15
|
+
def get_width() puts "Abstract Table.get_width called" end
|
16
|
+
def get_height() puts "Abstract Table.get_height called" end
|
15
17
|
end
|
16
18
|
|
17
19
|
end
|
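The README notes that daff works on a table interface you supply, and `merger.rb` calls `get_width`, `get_height`, `get_cell` and `set_cell` on its tables. The sketch below shows what an array-backed object answering those four calls might look like. `ArrayTable` is a hypothetical class, not one shipped with the gem, and it deliberately omits the rest of the interface (for example `insert_or_delete_rows` and `get_cell_view`), so it is not a drop-in table for daff.

```ruby
# Hypothetical array-of-rows table exposing four of the accessors
# that merger.rb calls. Not a complete daff table.
class ArrayTable
  def initialize(rows)
    @rows = rows
  end

  # Number of rows.
  def get_height
    @rows.length
  end

  # Number of columns, taken from the first row.
  def get_width
    @rows.empty? ? 0 : @rows.first.length
  end

  # Column index first, then row index, matching the calls in merger.rb.
  def get_cell(x, y)
    @rows[y][x]
  end

  def set_cell(x, y, value)
    @rows[y][x] = value
  end
end
```

For real use, the gem's own `SimpleTable` (constructed in `coopy.rb` as `::Coopy::SimpleTable.new(0,0)`) is the safer starting point.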
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: daff
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 1.1.
|
4
|
+
version: 1.1.5
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -10,7 +10,7 @@ authors:
|
|
10
10
|
autorequire:
|
11
11
|
bindir: bin
|
12
12
|
cert_chain: []
|
13
|
-
date: 2014-
|
13
|
+
date: 2014-07-10 00:00:00.000000000 Z
|
14
14
|
dependencies: []
|
15
15
|
description: Diff and patch tables
|
16
16
|
email:
|
@@ -61,6 +61,7 @@ files:
|
|
61
61
|
- lib/lib/coopy/table_comparison_state.rb
|
62
62
|
- lib/lib/coopy/csv.rb
|
63
63
|
- lib/lib/coopy/ordering.rb
|
64
|
+
- lib/lib/coopy/merger.rb
|
64
65
|
- lib/lib/coopy/change.rb
|
65
66
|
- lib/lib/coopy/sparse_sheet.rb
|
66
67
|
- lib/lib/coopy/report.rb
|
@@ -82,6 +83,7 @@ files:
|
|
82
83
|
- lib/lib/haxe/format/json_printer.rb
|
83
84
|
- lib/lib/sys.rb
|
84
85
|
- bin/daff.rb
|
86
|
+
- README.md
|
85
87
|
homepage: https://github.com/paulfitz/daff
|
86
88
|
licenses:
|
87
89
|
- MIT
|
@@ -106,44 +108,59 @@ rubyforge_project:
|
|
106
108
|
rubygems_version: 1.8.23
|
107
109
|
signing_key:
|
108
110
|
specification_version: 3
|
109
|
-
summary: ! 'daff
|
110
|
-
|
111
|
-
|
112
|
-
|
113
|
-
|
114
|
-
|
115
|
-
|
116
|
-
a
|
117
|
-
|
111
|
+
summary: ! '[![Build Status](https://travis-ci.org/paulfitz/daff.svg?branch=master)](https://travis-ci.org/paulfitz/daff)
|
112
|
+
[![NPM version](https://badge.fury.io/js/daff.svg)](http://badge.fury.io/js/daff)
|
113
|
+
[![Gem Version](https://badge.fury.io/rb/daff.svg)](http://badge.fury.io/rb/daff)
|
114
|
+
[![PyPI version](https://badge.fury.io/py/daff.svg)](http://badge.fury.io/py/daff) daff:
|
115
|
+
data diff =============== This is a library for comparing tables, producing a summary
|
116
|
+
of their differences, and using such a summary as a patch file. It is optimized
|
117
|
+
for comparing tables that share a common origin, in other words multiple versions
|
118
|
+
of the "same" table. For a live demo, see: > http://paulfitz.github.com/daff/ Download
|
119
|
+
the code for your preferred language here: > https://github.com/paulfitz/daff/releases For
|
120
|
+
certain languages you can use the command-line: ````sh npm install daff # node/javascript
|
121
|
+
pip3 install daff # python3 gem install daff # ruby ```` Or use the library
|
122
|
+
to view csv diffs on github via a chrome extension: > https://github.com/theodi/csvhub The
|
123
|
+
diff format used by `daff` is specified here: > http://dataprotocols.org/tabular-diff-format/ This
|
118
124
|
library is a stripped down version of the coopy toolbox (see http://share.find.coop). To
|
119
125
|
compare tables from different origins, or with automatically generated IDs, or
|
120
|
-
other complications, check out the coopy toolbox. The program -----------
|
121
|
-
|
122
|
-
|
123
|
-
|
124
|
-
|
125
|
-
|
126
|
-
|
127
|
-
[--
|
128
|
-
|
129
|
-
|
130
|
-
|
131
|
-
|
132
|
-
|
133
|
-
|
134
|
-
|
135
|
-
|
136
|
-
|
137
|
-
|
138
|
-
|
139
|
-
|
140
|
-
|
141
|
-
=
|
142
|
-
|
143
|
-
|
144
|
-
|
145
|
-
|
146
|
-
the
|
126
|
+
other complications, check out the coopy toolbox. The program ----------- You
|
127
|
+
can run `daff`/`daff.py`/`daff.rb` as a utility program: ```` $ daff daff can produce
|
128
|
+
and apply tabular diffs. Call as: daff [--output OUTPUT.csv] a.csv b.csv daff [--output
|
129
|
+
OUTPUT.csv] parent.csv a.csv b.csv daff [--output OUTPUT.jsonbook] a.jsonbook b.jsonbook
|
130
|
+
daff patch [--output OUTPUT.csv] source.csv patch.csv daff trim [--output OUTPUT.csv]
|
131
|
+
source.csv daff render [--output OUTPUT.html] diff.csv If you need more control,
|
132
|
+
here is the full list of flags: daff diff [--output OUTPUT.csv] [--context NUM]
|
133
|
+
[--all] [--act ACT] a.csv b.csv --context NUM: show NUM rows of context --all: do
|
134
|
+
not prune unchanged rows --act ACT: show only a certain kind of change (update,
|
135
|
+
insert, delete) daff render [--output OUTPUT.html] [--css CSS.css] [--fragment]
|
136
|
+
[--plain] diff.csv --css CSS.css: generate a suitable css file to go with the html
|
137
|
+
--fragment: generate just a html fragment rather than a page --plain: do
|
138
|
+
not use fancy utf8 characters to make arrows prettier ```` Using with git -------------- Run
|
139
|
+
`daff git csv` to see how to use daff to improve `git`''s handling of csv files. ````
|
140
|
+
$ daff git csv You can use daff to improve git''s handling of csv files, by using
|
141
|
+
it as a diff driver (for showing what has changed) and as a merge driver (for merging
|
142
|
+
changes between multiple versions). Here is how. Create and add a file called
|
143
|
+
.gitattributes in the root directory of your repository, containing: *.csv diff=daff-diff
|
144
|
+
*.csv merge=daff-merge Create a file called .gitconfig in your home directory (or
|
145
|
+
alternatively open .git/config for a particular repository) and add: [merge "daff-merge"]
|
146
|
+
name = daff tabular merge driver = daff merge --output %A %O %A %B [diff "daff-diff"]
|
147
|
+
command = daff diff --git Make sure you can run daff from the command-line as just
|
148
|
+
"daff" - if not, replace "daff" in the driver and command lines above with the correct
|
149
|
+
way to call it. ```` The library ----------- You can use `daff` as a library from
|
150
|
+
any supported language. We take here the example of Javascript. To use `daff`
|
151
|
+
on a webpage, first include `daff.js`: ```html <script src="daff.js"></script> ```
|
152
|
+
Or if using node outside the browser: ```js var daff = require(''daff''); ``` For
|
153
|
+
concreteness, assume we have two versions of a table, `data1` and `data2`: ```js
|
154
|
+
var data1 = [ [''Country'',''Capital''], [''Ireland'',''Dublin''], [''France'',''Paris''],
|
155
|
+
[''Spain'',''Barcelona''] ]; var data2 = [ [''Country'',''Code'',''Capital''], [''Ireland'',''ie'',''Dublin''],
|
156
|
+
[''France'',''fr'',''Paris''], [''Spain'',''es'',''Madrid''], [''Germany'',''de'',''Berlin'']
|
157
|
+
]; ``` To make those tables accessible to the library, we wrap them in `daff.TableView`:
|
158
|
+
```js var table1 = new daff.TableView(data1); var table2 = new daff.TableView(data2);
|
159
|
+
``` We can now compute the alignment between the rows and columns in the two tables:
|
160
|
+
```js var alignment = daff.compareTables(table1,table2).align(); ``` To produce
|
161
|
+
a diff from the alignment, we first need a table for the output: ```js var data_diff
|
162
|
+
= []; var table_diff = new daff.TableView(data_diff); ``` Using default options
|
163
|
+
for the diff: ```js var flags = new daff.CompareFlags(); var highlighter = new daff.TableDiff(alignment,flags);
|
147
164
|
highlighter.hilite(table_diff); ``` The diff is now in `data_diff` in highlighter
|
148
165
|
format, see specification here: > http://share.find.coop/doc/spec_hilite.html ```js
|
149
166
|
[ [ ''!'', '''', ''+++'', '''' ], [ ''@@'', ''Country'', ''Code'', ''Capital'' ],
|
@@ -156,22 +173,22 @@ summary: ! 'daff: data diff =============== This is a library for comparing tab
|
|
156
173
|
differences (that is, comparing two tables given knowledge of a common ancestor)
|
157
174
|
use `daff.compareTables3` (give ancestor table as the first argument). Here is
|
158
175
|
how to apply that difference as a patch: ```js var patcher = new daff.HighlightPatch(table1,table_diff);
|
159
|
-
patcher.apply(); // table1 should now equal table2 ```
|
160
|
-
|
161
|
-
|
162
|
-
|
163
|
-
|
164
|
-
|
176
|
+
patcher.apply(); // table1 should now equal table2 ``` For other languages, you
|
177
|
+
should find sample code in the packages on the [Releases](https://github.com/paulfitz/daff/releases)
|
178
|
+
page. Supported languages ------------------- The `daff` library is written in
|
179
|
+
[Haxe](http://haxe.org/), which can be translated reasonably well into at least
|
180
|
+
the following languages: * Javascript * PHP * Python * Java * C# * C++ * (via a
|
181
|
+
hack, just for `daff`) Ruby Some translations are done for you on the [Releases](https://github.com/paulfitz/daff/releases)
|
182
|
+
page. To make another translation, follow the [Haxe getting started tutorial](http://haxe.org/doc/start)
|
165
183
|
for the language you care about, then do one of: ``` make js make php make py make
|
166
184
|
java make cs make cpp ``` [@Floppy](https://github.com/Floppy) has made a lovingly-hand-written
|
167
185
|
[native Ruby port](https://github.com/theodi/coopy-ruby) that covers core functionality. I''ve
|
168
|
-
made a brutally-machine-converted
|
169
|
-
|
170
|
-
|
171
|
-
|
172
|
-
|
173
|
-
|
174
|
-
of the diff format we use. * http://theodi.org/blog/csvhub-github-diffs-for-csv-files
|
186
|
+
made a brutally-machine-converted port that is a full translation but less idiomatic. For
|
187
|
+
each language, the `daff` library expects to be handed an interface to tables you
|
188
|
+
create, rather than creating them itself. This is to avoid inefficient copies from
|
189
|
+
one format to another. You''ll find a `SimpleTable` class you can use if you find
|
190
|
+
this awkward. Reading material ---------------- * http://dataprotocols.org/tabular-diff-format/
|
191
|
+
: a specification of the diff format we use. * http://theodi.org/blog/csvhub-github-diffs-for-csv-files
|
175
192
|
: using this library with github. * http://theodi.org/blog/adapting-git-simple-data
|
176
193
|
: using this library with gitlab. * http://okfnlabs.org/blog/2013/08/08/diffing-and-patching-data.html
|
177
194
|
: a summary of where the library came from. * http://blog.okfn.org/2013/07/02/git-and-github-for-data/
|