diffxml 0.2.1 → 0.3.0
- checksums.yaml +4 -4
- data/README.md +15 -4
- data/diffxml.gemspec +1 -1
- data/lib/diffxml.rb +12 -1
- data/rspec/diffXML_spec.rb +1 -1
- data/rspec/wikimediaxml_test.xml +4138 -0
- metadata +3 -2
checksums.yaml
CHANGED

    @@ -1,7 +1,7 @@
     ---
     SHA1:
    -  metadata.gz:
    -  data.tar.gz:
    +  metadata.gz: 0bc5b49cdc00aac5a186f09b7bed579aeedb2524
    +  data.tar.gz: f599cb2acaab4a4b6ce4272e07ac0186248edbbd
     SHA512:
    -  metadata.gz:
    -  data.tar.gz:
    +  metadata.gz: 2679443da271df1d7163e534d79cca89a9af9b11f089b28affa9af4ee166fe5c75fefe914c0ea52444236ef9681b7dd97f3f79eda98f4941bcb73cfa8d626020
    +  data.tar.gz: 97efd9818fb4986b6b4c05365a50532bd71719a0328cdddc50652751abcba447e6f0a0bd1f84070561d0681e6abc5b4d33271c3416be2ba932420da2f3a34707
data/README.md
CHANGED

    @@ -35,20 +35,31 @@ DiffXML.compareXML(doc1, doc2)
     ```
     the returned value will be an array with the XPaths of all nodes that were not matched.
     
    +If you are interested in ignoring specific children you can pass in css selectors or XPaths with the same method
    +CSS with:
    +```ruby
    +DiffXML.compareXML(doc1, doc2, ArrayOfIgnores, true)
    +```
    +XPath with:
    +```ruby
    +DiffXML.compareXML(doc1, doc2, ArrayOfIgnores)
    +```
    +
     ## To Do
     * Plans to return the values of both nodes that are at the XPath in the array, as well as the XPath location are in the works.
     * General upkeep and a more rigorous test set are also planned.
     * RDoc implementation for documentation.
     * optimize searches: the memory handling has been improved, however, the search still does not differentiate between nodes with the same path, meaning xmls in different orders may report false negatives(untested)
     because it just compares the string of the node set as opposed to comparing each node in the set individually.
    -* Add
    -* Refactor Utility methods into seperate file
    +* Add attribute comparison
     
     ## Known Issues
    -* ~~With large XML documents, specifically with tested documents over 1500 elements, but possibly fewer, the gem will reach a point where it cannot allocate memory.~~
    -* Fixed in latest commit, will be applied with version 0.2.0 release, optimization of the compare is still needed. HUGE increase in speed when only collecting namespaces once 5500 seconds to 55 seconds!
     
     ## Contributing
     
     Bug reports and pull requests are welcome on GitHub at https://github.com/pbubnar/diffxml.
     
    +## Sources
    +
    +The test xml data for the wikimedia xml came from http://dumps.wikimedia.your.org/backups-of-old-wikis.html
    +
data/diffxml.gemspec
CHANGED
data/lib/diffxml.rb
CHANGED

    @@ -3,13 +3,24 @@ require 'DiffXML/utils'
     
     module DiffXML
       @xpathArray = []
    -  def self.compareXML(doc1, doc2)
    +  def self.compareXML(doc1, doc2, ignores = [], css = false)
    +    raise "Expected true or false for css parameter, got #{css}" unless css.is_a? TrueClass or css.is_a? FalseClass
         @namespaces = doc1.collect_namespaces
    +    cssClass = Nokogiri::CSS
         if doc1.class == Nokogiri::XML::Document
           DiffXML::Utils.collectXPaths(doc1.root, @xpathArray)
         else
           DiffXML::Utils.collectXPaths(doc1, @xpathArray)
         end
    +    if !ignores.empty?
    +      ignores.each do |ignore|
    +        if css
    +          @xpathArray.delete(cssClass.xpath_for(ignore)[0])
    +        else
    +          @xpathArray.delete(ignore)
    +        end
    +      end
    +    end
         @xpathArray.delete_if do |element|
           compareToPath(element, doc1, doc2)
         end
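Reviewer note: the ignore handling added to `compareXML` deletes entries from the collected XPath list by exact string equality before any comparison runs. The sketch below isolates that step using plain arrays so it can be run without Nokogiri or the gem. The names `filter_ignored`, `css_to_xpath`, and `toy_converter` are hypothetical and exist only for illustration. The real code calls `Nokogiri::CSS.xpath_for`, whose default output starts with `//`, so whether a translated selector actually matches a collected path depends on what `DiffXML::Utils.collectXPaths` stores, which this diff does not show.

```ruby
# Hypothetical stand-in for the ignore-filtering step added in 0.3.0.
# Unlike the gem, it returns a new array instead of mutating shared state.
def filter_ignored(xpaths, ignores = [], css = false, css_to_xpath: nil)
  # Same intent as the guard in the diff: only literal true/false are accepted.
  unless css == true || css == false
    raise ArgumentError, "Expected true or false for css parameter, got #{css.inspect}"
  end
  raise ArgumentError, 'css_to_xpath is required when css is true' if css && css_to_xpath.nil?

  # Translate selectors first (when requested), then drop exact string matches.
  targets = css ? ignores.map { |selector| css_to_xpath.call(selector) } : ignores
  xpaths - targets
end

collected = ['/doc/first', '/doc/second', '/doc/third']

# XPath mode: ignore entries are compared as plain strings.
p filter_ignored(collected, ['/doc/second'])

# CSS mode with a toy converter that only understands "a > b" chains.
# It is an assumption for this sketch, not Nokogiri's translation.
toy_converter = ->(selector) { '/' + selector.split(/\s*>\s*/).join('/') }
p filter_ignored(collected, ['doc > third'], true, css_to_xpath: toy_converter)
```

Because matching is by exact string, an ignore entry that differs from the collected path in any way (a leading `//`, a positional predicate, a namespace prefix) is silently left in place, so a spec covering both modes against real documents would be a useful addition.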
|
data/rspec/diffXML_spec.rb
CHANGED

    @@ -29,7 +29,7 @@ describe DiffXML do
       end
     
       it 'should retrieve and compare a node from a second document using a Path' do
    -    expect(DiffXML
    +    expect(DiffXML.compareToPath('doc/first',xml1,xml2)).to eql true
       end
     
       it 'should go through 2 XMLs removing XPaths from the array as they are found' do