RubyGems - xmlcodec - Versions diffs - 0.1.2 → 0.1.3 - Mend

xmlcodec 0.1.2 → 0.1.3

Files changed (6)

data/README.rdoc ADDED

@@ -0,0 +1,123 @@
+= xmlcodec
+This is a framework for creating importers/exporters that map XML formats to Ruby objects. To create a new importer/exporter, all you have to do is write a simple Ruby class for each of the XML elements. This gives you four main API interactions for free, all using the same objects:
+* Create a tree of Ruby objects and export it as XML
+* Import a full XML document as a tree of Ruby objects
+* Stream parse an XML document with events for elements as Ruby objects
+* Create XML documents of arbitrary size with constant memory usage by writing out the XML incrementally while the in-memory tree is being created.
+The first two APIs handle full trees at all times. The stream parser lets you parse a very large XML file as a stream, like a SAX parser, but delivers fully formed Ruby objects as events, so you use the same object APIs without ever holding the full tree in memory. The partial export API lets you create huge XML files the same way you'd create a small one (by putting elements in the Ruby tree) but without having the whole tree in memory at any one time.
+This project was created as an extract of work done at {Arquivo Nacional da Torre do Tombo}[http://antt.dgarq.gov.pt/].
+== Usage
+To create an importer/exporter for this XML format:
+  <root>
+    <firstelement>
+      <secondelement firstattr='1'>
+        some value
+      </secondelement>
+      <secondelement firstattr='2'>
+        some other value
+      </secondelement>
+    </firstelement>
+  </root>
+you would create the following classes:
+  require 'xmlcodec'
+  class Root < XMLCodec::XMLElement
+    elname 'root'
+    xmlsubel :firstelement
+  end
+  class FirstElement < XMLCodec::XMLElement
+    elname 'firstelement'
+    xmlsubel_mult :secondelement
+  end
+  class SecondElement < XMLCodec::XMLElement
+    elname 'secondelement'
+    elwithvalue
+    xmlattr :firstattr
+  end
+elname defines the name of the element in the XML DOM. xmlsubel defines a
+subelement that may exist only once. xmlsubel_mult defines a subelement that may
+appear several times. xmlattr defines an attribute for the element. The classes
+will respond to accessor methods with the names of the subelements and
+attributes.
+There is one more way to declare subelements:
+  class SomeOtherElement < XMLCodec::XMLElement
+    elname 'stuff'
+    xmlsubelements
+  end
+This one defines an element that can have a bunch of elements of different types
+whose order is important. The class will have a #subelements method that gives
+access to a container with the collection of the elements.
+This is all you have to define to implement the importer/exporter for the
+format.
+To import XML just do:
+  # From text
+  Root.import_xml_text(File.new('file.xml'))
+  # From a REXML DOM
+  Root.import_xml(REXML::Document.new(File.new('file.xml')))
+To export do:
+  # To generate XML text
+  string = some_element.xml_text
+  # To generate REXML DOM
+  doc = some_element.create_xml(REXML::Document.new)
+All these calls require keeping the whole contents of the document in memory.
+The ones that use the REXML DOM will have it twice. To handle large documents with constant memory usage another set of APIs is available.
+To stream parse a large document you'd do something like:
+  class MyStreamListener
+    def el_secondelement(el)
+      obj = el.get_object
+      # ... do something with obj ...
+      # To remove it from the stream so the parent
+      # doesn't include it and memory is freed.
+      el.consume
+    end
+  end
+  parser = XMLCodec::XMLStreamObjectParser.new(MyStreamListener.new)
+  parser.parse(some_string_or_file)
+You can define a listener method for each element you want to handle. Calling el.consume discards the element so that its memory is freed. Note that a consumed element will not be part of its parent when the parent's event comes around.
+To produce very large XML files with constant memory usage you would do something like:
+  file = File.new('somefile.xml', 'w')
+  fe = FirstElement.new
+  10000.times do |i|
+    se = SecondElement.new(i)
+    fe.secondelement << se
+    se.partial_export(file)
+  end
+  fe.end_partial_export(file)
+Here 10000 instances of <secondelement> were written to the file. Because the partial_export calls are made inside the loop, each instance is written to the file and removed from the parent, so at any point only one instance each of FirstElement and SecondElement is in memory. Apart from the calls to the partial_export methods, the code is the same as you'd use to create the tree in memory.
+== Author
+Pedro Côrte-Real <pedro@pedrocr.net>
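
The new README describes class-level declarations (elname, xmlattr and friends) that give each element class its accessors. As a rough illustration of how such macros can work in plain Ruby, here is a minimal sketch. It is not xmlcodec's implementation: `SketchElement`, `SketchSecond`, `attr_names` and `open_tag` are hypothetical names invented for this example, and attribute values are not escaped.

```ruby
# Hypothetical sketch of a declaration-style DSL; NOT xmlcodec's actual code.
class SketchElement
  # Class-level macro: records (or returns) the XML element name.
  def self.elname(name = nil)
    @elname = name if name
    @elname
  end

  # Class-level macro: remembers the attribute and defines reader/writer methods.
  def self.xmlattr(name)
    attr_names << name
    attr_accessor name
  end

  # Attribute names declared on this particular class.
  def self.attr_names
    @attr_names ||= []
  end

  # Builds the opening tag from the declared attributes that have a value.
  # Values are inserted as-is (no escaping) to keep the sketch short.
  def open_tag
    attrs = self.class.attr_names.map do |name|
      value = public_send(name)
      value.nil? ? nil : " #{name}='#{value}'"
    end
    "<#{self.class.elname}#{attrs.compact.join}>"
  end
end

class SketchSecond < SketchElement
  elname 'secondelement'
  xmlattr :firstattr
end

el = SketchSecond.new
el.firstattr = '1'
puts el.open_tag
```

Because the declarations run while the class body is evaluated, the accessors exist before any instance is created, which is what lets the same objects serve both import and export.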

data/Rakefile CHANGED

@@ -1,5 +1,5 @@
 PKG_NAME = 'xmlcodec'
-PKG_VERSION = '0.1.2'
+PKG_VERSION = '0.1.3'
 require 'rake'
 require 'rake/testtask'
@@ -20,7 +20,7 @@ PKG_FILES = FileList[TEST_FILES,
                      'Rakefile']
 RDOC_OPTIONS = ['-S', '-w 2', '-N']
-RDOC_EXTRA_FILES = ['README']
+RDOC_EXTRA_FILES = ['README.rdoc']
 spec = Gem::Specification.new do |s|
   s.platform = Gem::Platform::RUBY
@@ -50,10 +50,10 @@ Rake::TestTask.new do |t|
 end
 Rake::RDocTask.new do |rd|
-  rd.main = "README"
+  rd.main = "README.rdoc"
   rd.name = :docs
   rd.rdoc_files.include(RDOC_EXTRA_FILES, CODE_FILES)
-  rd.rdoc_dir = 'web/doc'
+  rd.rdoc_dir = 'doc'
   rd.title = "#{PKG_NAME} API"
   rd.options = RDOC_OPTIONS
 end

data/lib/XMLUtils.rb CHANGED

@@ -60,10 +60,13 @@ module XMLUtils
   # Gets the xpath inside a given document that can either be a string or a
   # REXML::Document
   #
-  # opts can have:
-  #   :multiple: fetch all the occurences of the xpath
-  #   :with_attrs: include the attribute contents in the result
-  #   :recursive: recursively include all the subelements of the matches
+  # Supported options (boolean):
+  # [:multiple]
+  #   fetch all the occurrences of the xpath
+  # [:with_attrs]
+  #   include the attribute contents in the result
+  # [:recursive]
+  #   recursively include all the subelements of the matches
   def self.get_xpath(path, doc, opts={})
     if doc.is_a? REXML::Document
       doc = doc

data/test/test_helper.rb CHANGED

@@ -14,16 +14,14 @@ class Test::Unit::TestCase
   end
   def validate_well_formed
-    filename = filename || @temp_path
     assert(system("xmllint --version > /dev/null 2>&1"),
            "xmllint utility not installed"+
            "(on ubuntu/debian install package libxml2-utils)")
-		assert(system("xmllint #{filename} >/dev/null"),
-           "Validation failed for #{filename}")
+		assert(system("xmllint #{@temp_path} >/dev/null"),
+           "Validation failed for #{@temp_path}")
 	end
   def compare_xpath(value, path)
-    filename = filename || @temp_path
-		assert_equal(value.strip, XMLUtils::select_path(path, filename).strip)
+		assert_equal(value.strip, XMLUtils::select_path(path, @temp_path).strip)
 	end
 end
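
The removed `filename = filename || @temp_path` lines had no effect beyond selecting `@temp_path`: Ruby treats `filename` as a local variable from the moment it parses the assignment, so the right-hand `filename` evaluates to nil. The small sketch below shows this behaviour in isolation; `resolve_path` is a hypothetical helper written for this note, not part of the gem.

```ruby
# Hypothetical helper mirroring the removed line; not part of xmlcodec.
def resolve_path(temp_path)
  # `filename` is already a local variable when the right-hand side is
  # evaluated, so it is nil there and the fallback is always chosen.
  filename = filename || temp_path
  filename
end

puts resolve_path('/tmp/example.xml')
```

Using `@temp_path` directly, as the new code does, expresses the same result without the misleading self-assignment.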

metadata CHANGED

@@ -1,13 +1,13 @@
 --- !ruby/object:Gem::Specification
 name: xmlcodec
 version: !ruby/object:Gem::Version
-  hash: 31
+  hash: 29
   prerelease: false
   segments:
   - 0
   - 1
-  - 2
-  version: 0.1.2
+  - 3
+  version: 0.1.3
 platform: ruby
 authors:
 - "Pedro C\xC3\xB4rte-Real"
@@ -28,7 +28,7 @@ executables: []
 extensions: []
 extra_rdoc_files:
-- README
+- README.rdoc
 files:
 - test/partial_export_test.rb
 - test/subelements_test.rb
@@ -45,7 +45,7 @@ files:
 - lib/xmlcodec.rb
 - lib/stream_object_parser.rb
 - lib/stream_parser.rb
-- README
+- README.rdoc
 - LICENSE
 - Rakefile
 has_rdoc: true

data/README DELETED

@@ -1,81 +0,0 @@
-This is a library that helps create Ruby importers/exporters of XML. The core
-of it is XMLCodec::XMLElement. To create an importer exporter for this XML
-format:
-  <root>
-    <firstelement>
-      <secondelement firstattr='1'>
-        some value
-      </secondelement>
-      <secondelement firstattr='2'>
-        some other value
-      </secondelement>
-    </firstelement>
-  </root>
-you'd create the following classes
-  require 'xmlcodec'
-  class Root < XMLCodec::XMLElement
-    elname 'root'
-    xmlsubel :firstelement
-  end
-  class FirstElement < XMLCodec::XMLElement
-    elname 'firstelement'
-    xmlsubel_mult :secondelement
-  end
-  class SecondElement < XMLCodec::XMLElement
-    elname 'secondelement'
-    elwithvalue
-    xmlattr :firstattr
-  end
-elname defines the name of the element in the XML DOM. xmlsubel defines a
-subelement that may exist only once. xmlsubel_mult defines a subelement that may
-appear several times. xmlattr defines an attribute for the element. The classes
-will respond to accessor methods with the names of the subelements and
-attributes.
-There is one more way to declare subelements:
-  class SomeOtherElement
-    elname 'stuff'
-    xmlsubelements
-  end
-This one defines an element that can have a bunch of elements of different types
-whose order is important. The class will have a #subelements method that gives
-access to a container with the collection of the elements.
-This is all you have to define to implement the importer/exporter for the
-format.
-To import from a file just do:
-  Root.import_xml_text(File.new('somefilename.xml'))
-or from a REXML DOM:
-  Root.import_xml(REXML::Document.new(File.new('somefilename.xml')))
-To export into a REXML DOM Document or Element do:
-  somerootelement.create_xml(REXML::Document.new)
-or to some XML text:
-  text = somerootelement.xml_text
-All these calls require keeping the whole contents of the document in memory.
-The ones that use the REXML DOM will have it twice. To handle large documents
-with constant memory usage you should try importing with
-XMLCodec::XMLStreamObjectParser and exporting with
-XMLCodec::XMLElement#partial_export.
-Author:
-  Pedro Côrte-Real
-  <pedro@pedrocr.net>
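
The deleted README only pointed to XMLCodec::XMLElement#partial_export for constant-memory output. The general idea behind such incremental export can be sketched in plain Ruby as follows. This is an illustration under stated assumptions, not the gem's code: `SketchStreamWriter`, `write_child` and `finish` are hypothetical names, and a StringIO stands in for the output file.

```ruby
require 'stringio'

# Hypothetical sketch of incremental XML export; NOT xmlcodec's implementation.
class SketchStreamWriter
  def initialize(io, root_name)
    @io = io
    @root_name = root_name
    @io << "<#{root_name}>" # the parent's opening tag is written once, up front
  end

  # Writes one child element immediately, so it never needs to be retained.
  def write_child(name, text)
    @io << "<#{name}>#{escape(text)}</#{name}>"
  end

  # Closes the parent element once all children have been written.
  def finish
    @io << "</#{@root_name}>"
  end

  private

  # Minimal escaping for character data.
  def escape(text)
    text.to_s.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
  end
end

out = StringIO.new
writer = SketchStreamWriter.new(out, 'firstelement')
3.times { |i| writer.write_child('secondelement', i) }
writer.finish
puts out.string
```

Memory use stays flat because each child is serialized and dropped before the next one is built; only the parent's name has to be remembered until the closing tag is written.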