rdf-raptor 0.4.0 → 0.4.1

File without changes
data/README CHANGED
@@ -1,5 +1,5 @@
1
- Raptor RDF Parser Wrapper for RDF.rb
2
- ====================================
1
+ Raptor RDF Parser Plugin for RDF.rb
2
+ ===================================
3
3
 
4
4
  This is an [RDF.rb][] plugin that adds support for parsing/serializing
5
5
  [RDF/XML][], [Turtle][] and [RDFa][] data using the [Raptor RDF Parser][Raptor]
@@ -11,8 +11,7 @@ library.
11
11
  Features
12
12
  --------
13
13
 
14
- * Requires the [Raptor][] library and utilities to be available.
15
- * Based on the [`rapper`][rapper] command-line utility bundled with Raptor.
14
+ * Requires the [Raptor][] library and/or command-line utilities.
16
15
  * Parses and serializes RDF data from/into the RDF/XML or Turtle formats.
17
16
  * Extracts RDF statements from XHTML+RDFa documents.
18
17
  * Provides serialization format autodetection for RDF/XML, Turtle and RDFa.
@@ -103,16 +102,18 @@ Documentation
103
102
  <http://rdf.rubyforge.org/raptor/>
104
103
 
105
104
  * {RDF::Raptor}
106
- * {RDF::Raptor::RDFXML}
105
+ * {RDF::Raptor::NTriples}
107
106
  * {RDF::Raptor::Turtle}
107
+ * {RDF::Raptor::RDFXML}
108
108
  * {RDF::Raptor::RDFa}
109
109
  * {RDF::Raptor::Graphviz}
110
110
 
111
111
  Dependencies
112
112
  ------------
113
113
 
114
- * [RDF.rb](http://rubygems.org/gems/rdf) (>= 0.2.0)
115
- * [Raptor][] (>= 1.4.16), specifically the `rapper` binary
114
+ * [RDF.rb](http://rubygems.org/gems/rdf) (>= 0.3.0)
115
+ * [FFI](http://rubygems.org/gems/ffi) (>= 1.0.0)
116
+ * [Raptor][] (>= 1.4.16), the `libraptor` library or the `rapper` binary
116
117
 
117
118
  Installation
118
119
  ------------
@@ -124,8 +125,8 @@ To install the latest official release of the `RDF::Raptor` gem, do:
124
125
 
125
126
  To install the required [Raptor][] command-line tools themselves, look for a
126
127
  `raptor` or `raptor-utils` package in your platform's package management
127
- system. Here follow installation instructions for the Mac and the most
128
- common Linux and BSD distributions:
128
+ system. For your convenience, here follow installation instructions for the
129
+ Mac and the most common Linux and BSD distributions:
129
130
 
130
131
  % [sudo] port install raptor # Mac OS X with MacPorts
131
132
  % [sudo] fink install raptor-bin # Mac OS X with Fink
@@ -144,27 +145,32 @@ To get a local working copy of the development repository, do:
144
145
 
145
146
  % git clone git://github.com/bendiken/rdf-raptor.git
146
147
 
147
- Alternatively, you can download the latest development version as a tarball
148
- as follows:
148
+ Alternatively, download the latest development version as a tarball as
149
+ follows:
149
150
 
150
151
  % wget http://github.com/bendiken/rdf-raptor/tarball/master
151
152
 
152
- Author
153
- ------
153
+ Mailing List
154
+ ------------
155
+
156
+ * <http://lists.w3.org/Archives/Public/public-rdf-ruby/>
157
+
158
+ Authors
159
+ -------
154
160
 
155
- * [Arto Bendiken](mailto:arto.bendiken@gmail.com) - <http://ar.to/>
156
- * [John Fieber](mailto:jrf@ursamaris.org) - <http://github.com/jfieber>
161
+ * [Arto Bendiken](http://github.com/bendiken) - <http://ar.to/>
162
+ * [John Fieber](http://github.com/jfieber) - <http://github.com/jfieber>
157
163
 
158
164
  Contributors
159
165
  ------------
160
166
 
161
- * [Ben Lavender](mailto:blavender@gmail.com) - <http://bhuga.net/>
167
+ * [Ben Lavender](http://github.com/bhuga) - <http://bhuga.net/>
162
168
 
163
169
  License
164
170
  -------
165
171
 
166
- `RDF::Raptor` is free and unencumbered public domain software. For more
167
- information, see <http://unlicense.org/> or the accompanying UNLICENSE file.
172
+ This is free and unencumbered public domain software. For more information,
173
+ see <http://unlicense.org/> or the accompanying {file:UNLICENSE} file.
168
174
 
169
175
  [RDF.rb]: http://rdf.rubyforge.org/
170
176
  [RDF/XML]: http://www.w3.org/TR/REC-rdf-syntax/
data/VERSION CHANGED
@@ -1 +1 @@
1
- 0.4.0
1
+ 0.4.1
@@ -9,7 +9,7 @@
9
9
  doap:name "RDF::Raptor" ;
10
10
  doap:homepage <http://rdf.rubyforge.org/raptor/> ;
11
11
  doap:license <http://creativecommons.org/licenses/publicdomain/> ;
12
- doap:shortdesc "Raptor RDF Parser wrapper for RDF.rb."@en ;
12
+ doap:shortdesc "Raptor RDF Parser plugin for RDF.rb."@en ;
13
13
  doap:description "RDF.rb plugin for parsing/serializing RDF/XML, Turtle and RDFa data using the Raptor RDF Parser library."@en ;
14
14
  doap:created "2010-03-23" ;
15
15
  doap:platform "Ruby" ;
@@ -1,14 +1,15 @@
1
- require 'tempfile'
2
- require 'rdf'
1
+ require 'rdf' # @see http://rubygems.org/gems/rdf
3
2
 
4
3
  module RDF
5
4
  ##
6
- # **`RDF::Raptor`** is a Raptor RDF Parser wrapper for RDF.rb.
5
+ # **`RDF::Raptor`** is a Raptor RDF Parser plugin for RDF.rb.
7
6
  #
8
- # * {RDF::Raptor::RDFXML} provides support for the standard
9
- # machine-readable RDF/XML format.
7
+ # * {RDF::Raptor::NTriples} provides support for the standard
8
+ # machine-readable N-Triples format.
10
9
  # * {RDF::Raptor::Turtle} provides support for the popular
11
10
  # human-readable Turtle format.
11
+ # * {RDF::Raptor::RDFXML} provides support for the standard
12
+ # machine-readable RDF/XML format.
12
13
  # * {RDF::Raptor::RDFa} provides support for extracting
13
14
  # RDF statements from XHTML+RDFa documents.
14
15
  # * {RDF::Raptor::Graphviz} provides support for serializing
@@ -26,12 +27,12 @@ module RDF
26
27
  # @example Obtaining the Raptor engine
27
28
  # RDF::Raptor::ENGINE #=> :ffi
28
29
  #
29
- # @example Obtaining an RDF/XML format class
30
- # RDF::Format.for(:rdfxml) #=> RDF::Raptor::RDFXML::Format
31
- # RDF::Format.for("input.rdf")
32
- # RDF::Format.for(:file_name => "input.rdf")
33
- # RDF::Format.for(:file_extension => "rdf")
34
- # RDF::Format.for(:content_type => "application/rdf+xml")
30
+ # @example Obtaining an N-Triples format class
31
+ # RDF::Format.for(:ntriples) #=> RDF::Raptor::NTriples::Format
32
+ # RDF::Format.for("input.nt")
33
+ # RDF::Format.for(:file_name => "input.nt")
34
+ # RDF::Format.for(:file_extension => "nt")
35
+ # RDF::Format.for(:content_type => "text/plain")
35
36
  #
36
37
  # @example Obtaining a Turtle format class
37
38
  # RDF::Format.for(:turtle) #=> RDF::Raptor::Turtle::Format
@@ -40,6 +41,13 @@ module RDF
40
41
  # RDF::Format.for(:file_extension => "ttl")
41
42
  # RDF::Format.for(:content_type => "text/turtle")
42
43
  #
44
+ # @example Obtaining an RDF/XML format class
45
+ # RDF::Format.for(:rdfxml) #=> RDF::Raptor::RDFXML::Format
46
+ # RDF::Format.for("input.rdf")
47
+ # RDF::Format.for(:file_name => "input.rdf")
48
+ # RDF::Format.for(:file_extension => "rdf")
49
+ # RDF::Format.for(:content_type => "application/rdf+xml")
50
+ #
43
51
  # @example Obtaining an RDFa format class
44
52
  # RDF::Format.for(:rdfa) #=> RDF::Raptor::RDFa::Format
45
53
  # RDF::Format.for("input.html")
@@ -47,24 +55,27 @@ module RDF
47
55
  # RDF::Format.for(:file_extension => "html")
48
56
  # RDF::Format.for(:content_type => "application/xhtml+xml")
49
57
  #
50
- # {RDF::Raptor} includes an ffi implementation, which loads
51
- # the libraptor library into the ruby process, and a cli
52
- # implementation, which uses the rapper command line tool
53
- # in a subprocess. The ffi implementation is used unless
54
- # libraptor library is not found, or the RDF_RAPTOR_ENGINE
55
- # environment variable is set to 'cli'.
58
+ # {RDF::Raptor} includes an FFI implementation, which loads the
59
+ # `libraptor` library into the Ruby process, as well as a CLI
60
+ # implementation, which drives the `rapper` command-line tool in a
61
+ # sub-process.
62
+ #
63
+ # The FFI implementation is used by default unless the `libraptor` library
64
+ # cannot be found, or if the `RDF_RAPTOR_ENGINE` environment variable is
65
+ # explicitly set to `'cli'`.
56
66
  #
57
- # If the libraptor library is in the standard library search
58
- # path, and the rapper command is in the standard command
59
- # search path, all should be well. If either is in a
60
- # non-standard location, set the RDF_RAPTOR_LIBPATH and/or
61
- # RDF_RAPTOR_BINPATH appropriately before requiring rdf/raptor.
67
+ # If the `libraptor` library is in the standard library search path, and
68
+ # the `rapper` command is in the standard command search path, all should
69
+ # be well and work fine out of the box. However, if either is in a
70
+ # non-standard location, be sure to set the `RDF_RAPTOR_LIBPATH` and/or
71
+ # `RDF_RAPTOR_BINPATH` environment variables appropriately before
72
+ # requiring `rdf/raptor`.
62
73
  #
63
74
  # @see http://rdf.rubyforge.org/
64
75
  # @see http://librdf.org/raptor/
65
76
  # @see http://wiki.github.com/ffi/ffi/
66
77
  #
67
- # @author [Arto Bendiken](http://ar.to/)
78
+ # @author [Arto Bendiken](http://github.com/bendiken)
68
79
  # @author [John Fieber](http://github.com/jfieber)
69
80
  module Raptor
70
81
  LIBRAPTOR = ENV['RDF_RAPTOR_LIBPATH'] || 'libraptor' unless const_defined?(:LIBRAPTOR)
@@ -72,13 +83,13 @@ module RDF
72
83
 
73
84
  require 'rdf/raptor/version'
74
85
  begin
75
- # Try ffi implementation
86
+ # Try FFI implementation
76
87
  raise LoadError if ENV['RDF_RAPTOR_ENGINE'] == 'cli' # override
77
88
  require 'rdf/raptor/ffi'
78
89
  include RDF::Raptor::FFI
79
90
  extend RDF::Raptor::FFI
80
91
  rescue LoadError => e
81
- # cli fallback
92
+ # CLI fallback
82
93
  require 'rdf/raptor/cli'
83
94
  include RDF::Raptor::CLI
84
95
  extend RDF::Raptor::CLI
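
Note on the hunk above: the engine is chosen by attempting the FFI implementation first and falling back to the CLI one on `LoadError`, with `RDF_RAPTOR_ENGINE=cli` forcing the fallback. A minimal sketch of that ordering follows. The `select_engine` method and its `loader` argument are hypothetical names used only for illustration; the gem does this inline with `require` and `rescue`.

```ruby
# Sketch of the selection order: FFI first, CLI as the fallback.
# `select_engine` and `loader` are illustrative names, not part of the gem.
def select_engine(env = ENV, loader = ->(path) { require path })
  # An explicit RDF_RAPTOR_ENGINE=cli skips the FFI attempt entirely.
  raise LoadError, 'FFI engine disabled' if env['RDF_RAPTOR_ENGINE'] == 'cli'
  loader.call('rdf/raptor/ffi') # raises LoadError when the engine is unavailable
  :ffi
rescue LoadError
  # Either the override was set or the FFI engine could not be loaded.
  :cli
end
```

Passing the loader in as an argument makes the fallback path easy to exercise without having `libraptor` installed.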
@@ -112,11 +123,12 @@ module RDF
112
123
  @rapper_format = format
113
124
  end
114
125
  end
115
- end
126
+ end # Format
116
127
 
117
- require 'rdf/raptor/rdfxml'
128
+ #require 'rdf/raptor/ntriples'
118
129
  require 'rdf/raptor/turtle'
130
+ require 'rdf/raptor/rdfxml'
119
131
  require 'rdf/raptor/rdfa'
120
132
  require 'rdf/raptor/graphviz'
121
- end # module Raptor
122
- end # module RDF
133
+ end # Raptor
134
+ end # RDF
@@ -1,8 +1,9 @@
1
+ require 'tempfile'
2
+
1
3
  module RDF::Raptor
2
4
  ##
3
5
  # A command-line interface to Raptor's `rapper` utility.
4
6
  module CLI
5
-
6
7
  ENGINE = :cli
7
8
 
8
9
  ##
@@ -18,18 +19,23 @@ module RDF::Raptor
18
19
  [$1, $2, $3].join('.')
19
20
  end
20
21
  end
22
+ module_function :version
21
23
 
22
24
  ##
23
- # Reader implementation.
25
+ # CLI reader implementation.
24
26
  class Reader < RDF::Reader
25
27
  ##
28
+ # Initializes the CLI reader instance.
29
+ #
26
30
  # @param [IO, File, RDF::URI, String] input
27
- # @param [Hash{Symbol => Object}] options
28
- # @option (options) [String, #to_s] :base_uri ("file:///dev/stdin")
29
- # @yield [reader]
30
- # @yieldparam [RDF::Reader] reader
31
+ # @param [Hash{Symbol => Object}] options
32
+ # any additional options (see `RDF::Reader#initialize`)
33
+ # @option options [String, #to_s] :base_uri ("file:///dev/stdin")
34
+ # @yield [reader] `self`
35
+ # @yieldparam [RDF::Reader] reader
36
+ # @yieldreturn [void] ignored
31
37
  def initialize(input = $stdin, options = {}, &block)
32
- raise RDF::ReaderError.new("`rapper` binary not found") unless RDF::Raptor.available?
38
+ raise RDF::ReaderError, "`rapper` binary not found" unless RDF::Raptor.available?
33
39
 
34
40
  format = self.class.format.rapper_format
35
41
  case input
@@ -37,16 +43,18 @@ module RDF::Raptor
37
43
  @command = "#{RAPPER} -q -i #{format} -o ntriples '#{input}'"
38
44
  @command << " '#{options[:base_uri]}'" if options.has_key?(:base_uri)
39
45
  @rapper = IO.popen(@command, 'rb')
46
+
40
47
  when File, Tempfile
41
48
  @command = "#{RAPPER} -q -i #{format} -o ntriples '#{File.expand_path(input.path)}'"
42
49
  @command << " '#{options[:base_uri]}'" if options.has_key?(:base_uri)
43
50
  @rapper = IO.popen(@command, 'rb')
51
+
44
52
  else # IO, String
45
53
  @command = "#{RAPPER} -q -i #{format} -o ntriples file:///dev/stdin"
46
54
  @command << " '#{options[:base_uri]}'" if options.has_key?(:base_uri)
47
55
  @rapper = IO.popen(@command, 'rb+')
48
56
  pid = fork do
49
- # process to feed rapper
57
+ # process to feed `rapper`
50
58
  begin
51
59
  @rapper.close_read
52
60
  if input.respond_to?(:read)
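
Note on the hunk above: each branch builds a `rapper -q -i FORMAT -o ntriples INPUT [BASE_URI]` command string and wraps the input and base URI in single quotes by hand. The sketch below assembles the same argument list but swaps in `Shellwords.join` from the Ruby standard library for the quoting, which is a different technique from the one in the diff. The `rapper_command` helper and its `rapper:` keyword are hypothetical.

```ruby
require 'shellwords'

# Hypothetical helper: builds the rapper command line as one shell-safe string.
# `rapper:` stands in for the gem's RAPPER constant.
def rapper_command(format, input, base_uri = nil, rapper: 'rapper')
  args = [rapper, '-q', '-i', format.to_s, '-o', 'ntriples', input.to_s]
  args << base_uri.to_s if base_uri # optional second positional argument
  Shellwords.join(args)             # escapes each argument for the shell
end

puts rapper_command(:turtle, 'input.ttl')
```

Escaping each argument means a path containing a single quote no longer terminates the quoted string early.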
@@ -65,13 +73,23 @@ module RDF::Raptor
65
73
  Process.detach(pid)
66
74
  @rapper.close_write
67
75
  end
68
- @reader = RDF::NTriples::Reader.new(@rapper, options, &block)
76
+
77
+ @options = options
78
+ @reader = RDF::NTriples::Reader.new(@rapper, @options).extend(Extensions)
79
+
80
+ if block_given?
81
+ case block.arity
82
+ when 0 then instance_eval(&block)
83
+ else block.call(self)
84
+ end
85
+ end
69
86
  end
70
87
 
71
- protected
88
+ protected
72
89
 
73
90
  ##
74
- # @return [Array]
91
+ # @return [Array(RDF::Resource, RDF::URI, RDF::Term)]
92
+ # @see RDF::Reader#read_triple
75
93
  def read_triple
76
94
  raise EOFError if @rapper.closed?
77
95
  begin
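
Note on the hunk above: the reader now dispatches on the block's arity, so a block without parameters is evaluated in the reader's own context while a block with a parameter receives the reader as an argument. A minimal sketch of that dispatch, using a hypothetical `SketchReader` class in place of the real reader:

```ruby
# Hypothetical stand-in for a reader; only the block dispatch mirrors the diff.
class SketchReader
  attr_reader :calls

  def initialize(&block)
    @calls = []
    return unless block_given?
    case block.arity
    when 0 then instance_eval(&block) # block runs with self set to the reader
    else block.call(self)             # block receives the reader explicitly
    end
  end

  # Records a message so the two calling styles can be told apart.
  def note(message)
    @calls << message
    self
  end
end

implicit = SketchReader.new { note(:implicit) }
explicit = SketchReader.new { |reader| reader.note(:explicit) }
```

Both calling styles end up invoking `note` on the same kind of object; the arity check only decides how the block gets access to it.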
@@ -83,18 +101,43 @@ module RDF::Raptor
83
101
  triple
84
102
  end
85
103
 
86
- end
104
+ ##
105
+ # Extensions for `RDF::NTriples::Reader`.
106
+ module Extensions
107
+ NODEID = RDF::NTriples::Reader::NODEID
108
+ GENID = /^genid\d+$/
109
+
110
+ ##
111
+ # Generates fresh random identifiers for Raptor's `_:genid[0-9]+`
112
+ # blank nodes, while preserving any user-specified blank node
113
+ # identifiers verbatim.
114
+ #
115
+ # @private
116
+ # @see RDF::NTriples::Reader#read_node
117
+ # @see https://github.com/bendiken/rdf-raptor/issues/#issue/9
118
+ def read_node
119
+ if node_id = match(NODEID)
120
+ @nodes ||= {}
121
+ @nodes[node_id] ||= RDF::Node.new(GENID === node_id ? nil : node_id)
122
+ end
123
+ end
124
+ end
125
+ end # Reader
87
126
 
88
127
  ##
89
- # Writer implementation.
128
+ # CLI writer implementation.
90
129
  class Writer < RDF::Writer
91
130
  ##
131
+ # Initializes the CLI writer instance.
132
+ #
92
133
  # @param [IO, File] output
93
134
  # @param [Hash{Symbol => Object}] options
94
- # @yield [writer]
95
- # @yieldparam [RDF::Writer] writer
135
+ # any additional options (see `RDF::Writer#initialize`)
136
+ # @yield [writer] `self`
137
+ # @yieldparam [RDF::Writer] writer
138
+ # @yieldreturn [void]
96
139
  def initialize(output = $stdout, options = {}, &block)
97
- raise RDF::WriterError.new("`rapper` binary not found") unless RDF::Raptor.available?
140
+ raise RDF::WriterError, "`rapper` binary not found" unless RDF::Raptor.available?
98
141
 
99
142
  format = self.class.format.rapper_format
100
143
  case output
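
Note on the `Extensions#read_node` method added in the hunk above: identifiers matching `genid\d+` are treated as Raptor-generated and replaced with fresh ones, while any other blank node identifier is kept verbatim, and the mapping is memoised so a given input identifier always yields the same node. The sketch below shows that mapping with plain strings instead of `RDF::Node` objects so that it runs without the `rdf` gem. `BlankNodeMap` and the `g` prefix on fresh identifiers are assumptions made for illustration.

```ruby
require 'securerandom'

# Hypothetical helper mirroring the memoised lookup in `read_node`.
class BlankNodeMap
  GENID = /^genid\d+$/ # same pattern as in the diff

  def initialize
    @nodes = {}
  end

  # Returns a stable identifier for node_id: a fresh random one for
  # Raptor-generated ids, the original string for user-specified ids.
  def fetch(node_id)
    @nodes[node_id] ||= (GENID === node_id ? "g#{SecureRandom.hex(8)}" : node_id)
  end
end

map = BlankNodeMap.new
map.fetch('genid1') # fresh identifier, stable across calls
map.fetch('alice')  # kept as "alice"
```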
@@ -103,16 +146,17 @@ module RDF::Raptor
103
146
  @command << " '#{options[:base_uri]}'" if options.has_key?(:base_uri)
104
147
  @rapper = IO.popen(@command, 'rb+')
105
148
  else
106
- raise ArgumentError.new("unsupported output type: #{output.inspect}")
149
+ raise ArgumentError, "unsupported output type: #{output.inspect}"
107
150
  end
108
151
  @writer = RDF::NTriples::Writer.new(@rapper, options)
109
152
  super(output, options, &block)
110
153
  end
111
154
 
112
- protected
155
+ protected
113
156
 
114
157
  ##
115
158
  # @return [void]
159
+ # @see RDF::Writer#write_prologue
116
160
  def write_prologue
117
161
  super
118
162
  end
@@ -120,8 +164,9 @@ module RDF::Raptor
120
164
  ##
121
165
  # @param [RDF::Resource] subject
122
166
  # @param [RDF::URI] predicate
123
- # @param [RDF::Value] object
167
+ # @param [RDF::Term] object
124
168
  # @return [void]
169
+ # @see RDF::Writer#write_triple
125
170
  def write_triple(subject, predicate, object)
126
171
  output_transit(false)
127
172
  @writer.write_triple(subject, predicate, object)
@@ -130,20 +175,23 @@ module RDF::Raptor
130
175
 
131
176
  ##
132
177
  # @return [void]
178
+ # @see RDF::Writer#write_epilogue
133
179
  def write_epilogue
134
180
  @rapper.close_write unless @rapper.closed?
135
181
  output_transit(true)
136
182
  end
137
183
 
138
184
  ##
139
- # Feed any available rapper output to the destination.
185
+ # Feeds any available `rapper` output to the destination.
186
+ #
187
+ # @param [Boolean] may_block
140
188
  # @return [void]
141
- def output_transit(block)
189
+ def output_transit(may_block)
142
190
  unless @rapper.closed?
143
191
  chunk_size = @options[:chunk_size] || 4096 # bytes
144
192
  begin
145
193
  loop do
146
- @output.write(block ? @rapper.readpartial(chunk_size) : @rapper.read_nonblock(chunk_size))
194
+ @output.write(may_block ? @rapper.readpartial(chunk_size) : @rapper.read_nonblock(chunk_size))
147
195
  end
148
196
  rescue EOFError => e
149
197
  @rapper.close
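
Note on `output_transit` in the hunk above: the renamed `may_block` flag selects between `readpartial`, which waits for data, and `read_nonblock`, which only takes what is already buffered. A minimal free-standing sketch of that loop follows, using `IO.pipe` in place of the `rapper` sub-process. The `drain` method is hypothetical, and the `IO::WaitReadable` rescue is an assumption about how the "nothing available yet" case would be handled, since that part of the method falls outside the hunk.

```ruby
require 'stringio'

# Hypothetical free-standing version of the drain loop. `source` is any IO,
# `sink` is anything that responds to #write.
def drain(source, sink, may_block, chunk_size = 4096)
  loop do
    chunk = may_block ? source.readpartial(chunk_size) : source.read_nonblock(chunk_size)
    sink.write(chunk)
  end
rescue IO::WaitReadable
  # Non-blocking mode and nothing is buffered right now; the caller retries later.
rescue EOFError
  source.close # the producer closed its end, so nothing more will arrive
end

reader, writer = IO.pipe
sink = StringIO.new
writer.write('<a> <b> <c> .')
drain(reader, sink, false) # copies what is buffered, then returns
writer.close
drain(reader, sink, true)  # sees EOF and closes the reader
```

This matches the call sites in the diff: `write_triple` drains without blocking after each statement, and `write_epilogue` closes the write side first and then drains with blocking reads until EOF.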
@@ -152,7 +200,6 @@ module RDF::Raptor
152
200
  end
153
201
  end
154
202
  end
155
-
156
- end
157
- end
158
- end
203
+ end # Writer
204
+ end # CLI
205
+ end # RDF::Raptor
@@ -1,4 +1,5 @@
1
- require 'ffi'
1
+ require 'tempfile'
2
+ require 'ffi' # @see http://rubygems.org/gems/ffi
2
3
 
3
4
  module RDF::Raptor
4
5
  ##
@@ -7,116 +8,150 @@ module RDF::Raptor
7
8
  # @see http://librdf.org/raptor/api/
8
9
  # @see http://librdf.org/raptor/libraptor.html
9
10
  module FFI
11
+ autoload :V1, 'rdf/raptor/ffi/v1'
10
12
 
11
13
  ENGINE = :ffi
12
14
 
13
15
  ##
14
- # Returns the installed `rapper` version number, or `nil` if `rapper` is
15
- # not available.
16
+ # Returns the installed `libraptor` version number, or `nil` if
17
+ # `libraptor` is not available.
16
18
  #
17
19
  # @example
18
20
  # RDF::Raptor.version #=> "1.4.21"
19
21
  #
20
- # @return [String]
22
+ # @return [String] an "x.y.z" version string
21
23
  def version
22
- [ V1_4.raptor_version_major,
23
- V1_4.raptor_version_minor,
24
- V1_4.raptor_version_release ].join('.')
24
+ [V1.raptor_version_major,
25
+ V1.raptor_version_minor,
26
+ V1.raptor_version_release].join('.').freeze
25
27
  end
28
+ module_function :version
26
29
 
27
30
  ##
28
- # Reader implementation.
31
+ # FFI reader implementation.
29
32
  class Reader < RDF::Reader
30
33
  ##
34
+ # Initializes the FFI reader instance.
35
+ #
31
36
  # @param [IO, File, RDF::URI, String] input
32
- # @param [Hash{Symbol => Object}] options
33
- # @option (options) [String, #to_s] :base_uri ("file:///dev/stdin")
34
- # @yield [reader]
35
- # @yieldparam [RDF::Reader] reader
37
+ # @param [Hash{Symbol => Object}] options
38
+ # any additional options (see `RDF::Reader#initialize`)
39
+ # @option options [String, #to_s] :base_uri ("file:///dev/stdin")
40
+ # @yield [reader] `self`
41
+ # @yieldparam [RDF::Reader] reader
42
+ # @yieldreturn [void] ignored
36
43
  def initialize(input = $stdin, options = {}, &block)
37
44
  @format = self.class.format.rapper_format
45
+ @parser = V1::Parser.new(@format)
46
+ @parser.error_handler = ERROR_HANDLER
47
+ @parser.warning_handler = WARNING_HANDLER
38
48
  super
39
49
  end
40
50
 
41
51
  ERROR_HANDLER = Proc.new do |user_data, locator, message|
42
- line = V1_4.raptor_locator_line(locator)
52
+ line = V1.raptor_locator_line(locator)
43
53
  raise RDF::ReaderError, line > -1 ? "Line #{line}: #{message}" : message
44
54
  end
45
55
 
46
56
  WARNING_HANDLER = Proc.new do |user_data, locator, message|
47
- # line = V1_4.raptor_locator_line(locator)
57
+ # line = V1.raptor_locator_line(locator)
48
58
  # $stderr.puts line > -1 ? "Line #{line}: #{message}" : message
49
59
  end
50
60
 
61
+ ##
62
+ # The Raptor parser instance.
63
+ #
64
+ # @return [V1::Parser]
65
+ attr_reader :parser
51
66
 
52
67
  ##
53
68
  # @yield [statement]
54
- # @yieldparam [RDF::Statement] statement
55
- def each_statement(&block)
56
- each_triple do |triple|
57
- block.call(RDF::Statement.new(*triple))
69
+ # @yieldparam [RDF::Statement] statement
70
+ # @yieldreturn [void] ignored
71
+ # @see RDF::Reader#each_statement
72
+ def each_statement(options = {}, &block)
73
+ if block_given?
74
+ if options[:raw]
75
+ # this is up to an order of magnitude faster...
76
+ parse(@input) do |parser, statement|
77
+ block.call(V1::Statement.new(statement, self))
78
+ end
79
+ else
80
+ parse(@input) do |parser, statement|
81
+ block.call(V1::Statement.new(statement, self).to_rdf)
82
+ end
83
+ end
58
84
  end
85
+ enum_for(:each_statement, options)
59
86
  end
87
+ alias_method :each, :each_statement
60
88
 
61
89
  ##
62
90
  # @yield [triple]
63
- # @yieldparam [Array(RDF::Resource, RDF::URI, RDF::Value)] triple
91
+ # @yieldparam [Array(RDF::Resource, RDF::URI, RDF::Term)] triple
92
+ # @yieldreturn [void] ignored
93
+ # @see RDF::Reader#each_triple
64
94
  def each_triple(&block)
65
- statement_handler = Proc.new do |user_data, statement|
66
- triple = V1_4::Statement.new(statement).to_triple
67
- block.call(triple)
95
+ if block_given?
96
+ parse(@input) do |parser, statement|
97
+ block.call(V1::Statement.new(statement, self).to_triple)
98
+ end
68
99
  end
100
+ enum_for(:each_triple)
101
+ end
69
102
 
70
- V1_4.with_parser(:name => @format) do |parser|
71
- V1_4.raptor_set_error_handler(parser, nil, ERROR_HANDLER)
72
- V1_4.raptor_set_warning_handler(parser, nil, WARNING_HANDLER)
73
- V1_4.raptor_set_statement_handler(parser, nil, statement_handler)
74
- case @input
75
- when RDF::URI, %r(^(file|http|https|ftp)://)
76
- begin
77
- data_url = V1_4.raptor_new_uri(@input.to_s)
78
- base_uri = @options[:base_uri].to_s.empty? ? nil : V1_4.raptor_new_uri(@options[:base_uri].to_s)
79
- unless (result = V1_4.raptor_parse_uri(parser, data_url, base_uri)).zero?
80
- # TODO: error handling
81
- end
82
- ensure
83
- V1_4.raptor_free_uri(base_uri) if base_uri
84
- V1_4.raptor_free_uri(data_url) if data_url
85
- end
86
-
87
- when File, Tempfile
88
- begin
89
- data_url = V1_4.raptor_new_uri("file://#{File.expand_path(@input.path)}")
90
- base_uri = @options[:base_uri].to_s.empty? ? nil : V1_4.raptor_new_uri(@options[:base_uri].to_s)
91
- unless (result = V1_4.raptor_parse_file(parser, data_url, base_uri)).zero?
92
- # TODO: error handling
93
- end
94
- ensure
95
- V1_4.raptor_free_uri(base_uri) if base_uri
96
- V1_4.raptor_free_uri(data_url) if data_url
97
- end
103
+ ##
104
+ # @private
105
+ # @param [RDF::URI, File, Tempfile, IO, StringIO] input
106
+ # the input stream
107
+ # @yield [parser, statement]
108
+ # each statement in the input stream
109
+ # @yieldparam [FFI::Pointer] parser
110
+ # @yieldparam [FFI::Pointer] statement
111
+ # @yieldreturn [void] ignored
112
+ # @return [void]
113
+ def parse(input, &block)
114
+ @parser.parse(input, @options, &block)
115
+ end
98
116
 
99
- else # IO, String
100
- base_uri = (@options[:base_uri] || 'file:///dev/stdin').to_s
101
- unless (result = V1_4.raptor_start_parse(parser, base_uri)).zero?
102
- # TODO: error handling
103
- end
104
- # TODO: read in chunks instead of everything in one go:
105
- unless (result = V1_4.raptor_parse_chunk(parser, buffer = @input.read, buffer.size, 0)).zero?
106
- # TODO: error handling
107
- end
108
- V1_4.raptor_parse_chunk(parser, nil, 0, 1) # EOF
109
- end
110
- end
117
+ GENID = /^genid\d+$/
111
118
 
119
+ ##
120
+ # @param [String] uri_str
121
+ # @return [RDF::URI]
122
+ def create_uri(uri_str)
123
+ RDF::URI.intern(uri_str)
112
124
  end
113
125
 
114
- alias_method :each, :each_statement
115
- end
126
+ ##
127
+ # @param [String] node_id
128
+ # @return [RDF::Node]
129
+ def create_node(node_id)
130
+ @nodes ||= {}
131
+ @nodes[node_id] ||= RDF::Node.new(GENID === node_id ? nil : node_id)
132
+ end
133
+ end # Reader
116
134
 
117
135
  ##
118
- # Writer implementation.
136
+ # FFI writer implementation.
119
137
  class Writer < RDF::Writer
138
+ ##
139
+ # Initializes the FFI writer instance.
140
+ #
141
+ # @param [IO, File] output
142
+ # @param [Hash{Symbol => Object}] options
143
+ # any additional options (see `RDF::Writer#initialize`)
144
+ # @yield [writer] `self`
145
+ # @yieldparam [RDF::Writer] writer
146
+ # @yieldreturn [void] ignored
147
+ def initialize(output = $stdout, options = {}, &block)
148
+ @format = self.class.format.rapper_format
149
+ @serializer = V1::Serializer.new(@format)
150
+ @serializer.error_handler = ERROR_HANDLER
151
+ @serializer.warning_handler = WARNING_HANDLER
152
+ @serializer.start_to(output, options)
153
+ super
154
+ end
120
155
 
121
156
  ERROR_HANDLER = Proc.new do |user_data, locator, message|
122
157
  raise RDF::WriterError, message
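
Note on `each_statement` and `each_triple` in the FFI reader hunk above: both now yield to a block when one is given and otherwise hand back an enumerator via `enum_for`. The sketch below shows a common early-return variant of that pattern on a hypothetical `SketchSource` class. It is not the gem's code: in the diff, `enum_for` is the method's last expression in both cases.

```ruby
# Hypothetical class illustrating the block-or-enumerator pattern.
class SketchSource
  def initialize(items)
    @items = items
  end

  # Yields each item when a block is given, otherwise returns an Enumerator.
  def each_item
    return enum_for(:each_item) unless block_given?
    @items.each { |item| yield item }
    self
  end
end

source = SketchSource.new(%w[s p o])
collected = []
source.each_item { |item| collected << item } # block form
lazy = source.each_item                        # enumerator form
```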
@@ -126,423 +161,37 @@ module RDF::Raptor
126
161
  # $stderr.puts "warning"
127
162
  end
128
163
 
129
- def initialize(output = $stdout, options = {}, &block)
130
- raise ArgumentError, "Block required" unless block_given? # Can we work without this?
131
- @format = self.class.format.rapper_format
132
- begin
133
- # make a serializer
134
- @serializer = V1_4.raptor_new_serializer((@format || :rdfxml).to_s)
135
- raise RDF::WriterError, "raptor_new_serializer failed" if @serializer.nil?
136
- V1_4.raptor_serializer_set_error_handler(@serializer, nil, ERROR_HANDLER)
137
- V1_4.raptor_serializer_set_warning_handler(@serializer, nil, WARNING_HANDLER)
138
- base_uri = options[:base_uri].to_s.empty? ? nil : V1_4.raptor_new_uri(options[:base_uri].to_s)
139
-
140
- # make an iostream
141
- handler = V1_4::IOStreamHandler.new
142
- handler.rubyio = output
143
- raptor_iostream = V1_4.raptor_new_iostream_from_handler2(nil, handler)
144
-
145
- # connect the two
146
- unless V1_4.raptor_serialize_start_to_iostream(@serializer, base_uri, raptor_iostream).zero?
147
- raise RDF::WriterError, "raptor_serialize_start_to_iostream failed"
148
- end
149
- super
150
- ensure
151
- V1_4.raptor_free_iostream(raptor_iostream) if raptor_iostream
152
- V1_4.raptor_free_uri(base_uri) if base_uri
153
- V1_4.raptor_free_serializer(@serializer) if @serializer
154
- end
155
- end
164
+ ##
165
+ # The Raptor serializer instance.
166
+ #
167
+ # @return [V1::Serializer]
168
+ attr_reader :serializer
156
169
 
157
170
  ##
158
171
  # @param [RDF::Resource] subject
159
172
  # @param [RDF::URI] predicate
160
- # @param [RDF::Value] object
173
+ # @param [RDF::Term] object
161
174
  # @return [void]
175
+ # @see RDF::Writer#write_triple
162
176
  def write_triple(subject, predicate, object)
163
- raptor_statement = V1_4::Statement.new
164
- raptor_statement.subject = subject
165
- raptor_statement.predicate = predicate
166
- raptor_statement.object = object
167
- begin
168
- unless V1_4.raptor_serialize_statement(@serializer, raptor_statement.to_ptr).zero?
169
- raise RDF::WriterError, "raptor_serialize_statement failed"
170
- end
171
- ensure
172
- raptor_statement.release
173
- raptor_statement = nil
174
- end
177
+ @serializer.serialize_triple(subject, predicate, object)
175
178
  end
176
179
 
177
180
  ##
178
181
  # @return [void]
182
+ # @see RDF::Writer#write_epilogue
179
183
  def write_epilogue
180
- unless V1_4.raptor_serialize_end(@serializer).zero?
181
- raise RDF::WriterError, "raptor_serialize_end failed"
182
- end
184
+ @serializer.finish
183
185
  super
184
186
  end
185
-
186
- end
187
-
187
+ end # Writer
188
188
 
189
189
  ##
190
- # Helper methods for FFI modules.
191
- module Base
192
- def define_pointer(name)
193
- self.class.send(:define_method, name) { :pointer }
194
- end
195
- end
196
-
197
- ##
198
- # A foreign-function interface (FFI) to `libraptor` 1.4.x.
199
- #
200
- # @see http://librdf.org/raptor/libraptor.html
201
- module V1_4
202
-
203
- ##
204
- # @param [Hash{Symbol => Object}] options
205
- # @option (options) [String, #to_s] :name (:rdfxml)
206
- # @yield [parser]
207
- # @yieldparam [FFI::Pointer] parser
208
- # @return [void]
209
- def self.with_parser(options = {}, &block)
210
- begin
211
- parser = raptor_new_parser((options[:name] || :rdfxml).to_s)
212
- block.call(parser)
213
- ensure
214
- raptor_free_parser(parser) if parser
215
- end
216
- end
217
-
218
-
219
- extend Base
190
+ # @private
191
+ module LibC
220
192
  extend ::FFI::Library
221
- ffi_lib LIBRAPTOR
222
-
223
- # TODO: Ideally this would be an enum, but the JRuby FFI (as of
224
- # version 1.4.0) has problems with enums as part of structs:
225
- # `Unknown field type: #<FFI::Enum> (ArgumentError)`
226
- RAPTOR_IDENTIFIER_TYPE_RESOURCE = 1
227
- RAPTOR_IDENTIFIER_TYPE_ANONYMOUS = 2
228
- RAPTOR_IDENTIFIER_TYPE_LITERAL = 5
229
-
230
- # @see http://librdf.org/raptor/api/raptor-section-triples.html
231
- class Statement < ::FFI::Struct
232
- layout :subject, :pointer,
233
- :subject_type, :int,
234
- :predicate, :pointer,
235
- :predicate_type, :int,
236
- :object, :pointer,
237
- :object_type, :int,
238
- :object_literal_datatype, :pointer,
239
- :object_literal_language, :pointer
240
-
241
- def initialize(*args)
242
- super
243
- # Objects we need to keep a Ruby reference
244
- # to so they don't get garbage collected out from under
245
- # the C code we pass them to.
246
- @mp = {}
247
-
248
- # Raptor object references we we need to explicitly free
249
- # when release is called
250
- @raptor_uri_list = []
251
- end
252
-
253
- ##
254
- # Release raptor memory associated with this struct.
255
- # Use of the object after calling this will most likely
256
- # cause a crash. This is kind of ugly.
257
- def release
258
- if pointer.kind_of?(::FFI::MemoryPointer) && !pointer.null?
259
- pointer.free
260
- end
261
- while uri = @raptor_uri_list.pop
262
- V1_4.raptor_free_uri(uri) unless uri.nil? || uri.null?
263
- end
264
- end
265
-
266
- ##
267
- # @return [RDF::Resource]
268
- def subject
269
- @subject ||= case self[:subject_type]
270
- when RAPTOR_IDENTIFIER_TYPE_RESOURCE
271
- RDF::URI.intern(V1_4.raptor_uri_to_string(self[:subject]))
272
- when RAPTOR_IDENTIFIER_TYPE_ANONYMOUS
273
- RDF::Node.new(self[:subject].read_string)
274
- end
275
- end
276
-
277
- ##
278
- # Set the subject from an RDF::Resource
279
- # @param [RDF::Resource] value
280
- def subject=(resource)
281
- @subject = nil
282
- case resource
283
- when RDF::Node
284
- self[:subject] = @mp[:subject] = ::FFI::MemoryPointer.from_string(resource.id.to_s)
285
- self[:subject_type] = RAPTOR_IDENTIFIER_TYPE_ANONYMOUS
286
- when RDF::URI
287
- self[:subject] = @mp[:subject] = @raptor_uri_list.push(V1_4.raptor_new_uri(resource.to_s)).last
288
- self[:subject_type] = RAPTOR_IDENTIFIER_TYPE_RESOURCE
289
- else
290
- raise ArgumentError, "subject must be of kind RDF::Node or RDF::URI"
291
- end
292
- @subject = resource
293
- end
294
-
295
- ##
296
- # @return [String]
297
- def subject_as_string
298
- V1_4.raptor_statement_part_as_string(
299
- self[:subject],
300
- self[:subject_type],
301
- nil, nil)
302
- end
303
-
304
- ##
305
- # @return [RDF::URI]
306
- def predicate
307
- @predicate ||= case self[:predicate_type]
308
- when RAPTOR_IDENTIFIER_TYPE_RESOURCE
309
- RDF::URI.intern(V1_4.raptor_uri_to_string(self[:predicate]))
310
- end
311
- end
312
-
313
- ##
314
- # Set the predicate from an RDF::URI
315
- # @param [RDF::URI] value
316
- def predicate=(uri)
317
- @predicate = nil
318
- raise ArgumentError, "predicate must be a kind of RDF::URI" unless uri.kind_of?(RDF::URI)
319
- self[:predicate] = @raptor_uri_list.push(V1_4.raptor_new_uri(uri.to_s)).last
320
- self[:predicate_type] = RAPTOR_IDENTIFIER_TYPE_RESOURCE
321
- @predicate = uri
322
- end
323
-
324
- ##
325
- # @return [String]
326
- def predicate_as_string
327
- V1_4.raptor_statement_part_as_string(
328
- self[:predicate],
329
- self[:predicate_type],
330
- nil, nil)
331
- end
332
-
333
- ##
334
- # @return [RDF::Value]
335
- def object
336
- @object ||= case self[:object_type]
337
- when RAPTOR_IDENTIFIER_TYPE_RESOURCE
338
- RDF::URI.intern(V1_4.raptor_uri_to_string(self[:object]))
339
- when RAPTOR_IDENTIFIER_TYPE_ANONYMOUS
340
- RDF::Node.new(self[:object].read_string)
341
- when RAPTOR_IDENTIFIER_TYPE_LITERAL
342
- case
343
- when self[:object_literal_language] && !self[:object_literal_language].null?
344
- RDF::Literal.new(self[:object].read_string, :language => self[:object_literal_language].read_string)
345
- when self[:object_literal_datatype] && !self[:object_literal_datatype].null?
346
- RDF::Literal.new(self[:object].read_string, :datatype => V1_4.raptor_uri_to_string(self[:object_literal_datatype]))
347
- else
348
- RDF::Literal.new(self[:object].read_string)
349
- end
350
- end
351
- end
352
-
353
- ##
354
- # Set the object from an RDF::Value.
355
- # Value must be one of RDF::Resource or RDF::Literal.
356
- # @param [RDF::Value] value
357
- def object=(value)
358
- @object = nil
359
- case value
360
- when RDF::Node
361
- self[:object] = @mp[:object] = ::FFI::MemoryPointer.from_string(value.id.to_s)
362
- self[:object_type] = RAPTOR_IDENTIFIER_TYPE_ANONYMOUS
363
- when RDF::URI
364
- self[:object] = @mp[:object] = @raptor_uri_list.push(V1_4.raptor_new_uri(value.to_s)).last
365
- self[:object_type] = RAPTOR_IDENTIFIER_TYPE_RESOURCE
366
- when RDF::Literal
367
- self[:object_type] = RAPTOR_IDENTIFIER_TYPE_LITERAL
368
- self[:object] = @mp[:object] = ::FFI::MemoryPointer.from_string(value.value)
369
- self[:object_literal_datatype] = if value.datatype
370
- @raptor_uri_list.push(V1_4.raptor_new_uri(value.datatype.to_s)).last
371
- else
372
- nil
373
- end
374
- self[:object_literal_language] = @mp[:object_literal_language] = if value.language?
375
- ::FFI::MemoryPointer.from_string(value.language.to_s)
376
- else
377
- nil
378
- end
379
- else
380
- raise ArgumentError, "object must be of type RDF::Node, RDF::URI or RDF::Literal"
381
- end
382
- @object = value
383
- end
384
-
385
- ##
386
- # @return [String]
387
- def object_as_string
388
- V1_4.raptor_statement_part_as_string(
389
- self[:object],
390
- self[:object_type],
391
- self[:object_literal_datatype],
392
- self[:object_literal_language])
393
- end
394
-
395
- ##
396
- # @return [Array(RDF::Resource, RDF::URI, RDF::Value)]
397
- def to_triple
398
- [subject, predicate, object]
399
- end
400
-
401
- ##
402
- # @return [Array(RDF::Resource, RDF::URI, RDF::Value, nil)]
403
- def to_quad
404
- [subject, predicate, object, nil]
405
- end
406
-
407
- end
408
-
409
- # @see http://librdf.org/raptor/api/tutorial-initialising-finishing.html
410
- attach_function :raptor_init, [], :void
411
- attach_function :raptor_finish, [], :void
412
-
413
- # @see http://librdf.org/raptor/api/raptor-section-locator.html
414
- define_pointer :raptor_locator
415
- attach_function :raptor_locator_line, [raptor_locator], :int
416
- attach_function :raptor_locator_column, [raptor_locator], :int
417
- attach_function :raptor_locator_byte, [raptor_locator], :int
418
-
419
- # @see http://librdf.org/raptor/api/raptor-section-general.html
420
- attach_variable :raptor_version_major, :int
421
- attach_variable :raptor_version_minor, :int
422
- attach_variable :raptor_version_release, :int
423
- attach_variable :raptor_version_decimal, :int
424
- callback :raptor_message_handler, [:pointer, raptor_locator, :string], :void
425
-
426
- # @see http://librdf.org/raptor/api/raptor-section-uri.html
427
- define_pointer :raptor_uri
428
- attach_function :raptor_new_uri, [:string], raptor_uri
429
- attach_function :raptor_uri_as_string, [raptor_uri], :string
430
- attach_function :raptor_uri_to_string, [raptor_uri], :string
431
- attach_function :raptor_uri_print, [raptor_uri, :pointer], :void
432
- attach_function :raptor_free_uri, [raptor_uri], :void
433
-
434
- # @see http://librdf.org/raptor/api/raptor-section-triples.html
435
- define_pointer :raptor_identifier
436
- define_pointer :raptor_statement
437
- attach_function :raptor_statement_compare, [raptor_statement, raptor_statement], :int
438
- attach_function :raptor_print_statement, [raptor_statement, :pointer], :void
439
- attach_function :raptor_print_statement_as_ntriples, [:pointer, :pointer], :void
440
- attach_function :raptor_statement_part_as_string, [:pointer, :int, raptor_uri, :string], :string
441
-
442
- # @see http://librdf.org/raptor/api/raptor-section-parser.html
443
- callback :raptor_statement_handler, [:pointer, raptor_statement], :void
444
- define_pointer :raptor_parser
445
- attach_function :raptor_new_parser, [:string], raptor_parser
446
- attach_function :raptor_set_error_handler, [raptor_parser, :pointer, :raptor_message_handler], :void
447
- attach_function :raptor_set_warning_handler, [raptor_parser, :pointer, :raptor_message_handler], :void
448
- attach_function :raptor_set_statement_handler, [raptor_parser, :pointer, :raptor_statement_handler], :void
449
- attach_function :raptor_parse_file, [raptor_parser, raptor_uri, raptor_uri], :int
450
- attach_function :raptor_parse_file_stream, [raptor_parser, :pointer, :string, raptor_uri], :int
451
- attach_function :raptor_parse_uri, [raptor_parser, raptor_uri, raptor_uri], :int
452
- attach_function :raptor_start_parse, [raptor_parser, :string], :int
453
- attach_function :raptor_parse_chunk, [raptor_parser, :string, :size_t, :int], :int
454
- attach_function :raptor_get_mime_type, [raptor_parser], :string
455
- attach_function :raptor_set_parser_strict, [raptor_parser, :int], :void
456
- attach_function :raptor_get_need_base_uri, [raptor_parser], :int
457
- attach_function :raptor_free_parser, [raptor_parser], :void
458
-
459
- # @see http://librdf.org/raptor/api/raptor-section-iostream.html
460
- define_pointer :raptor_iostream
461
- callback :raptor_iostream_init_func, [:pointer], :int
462
- callback :raptor_iostream_finish_func, [:pointer], :void
463
- callback :raptor_iostream_write_byte_func, [:pointer, :int], :int
464
- callback :raptor_iostream_write_bytes_func, [:pointer, :pointer, :size_t, :size_t], :int
465
- callback :raptor_iostream_write_end_func, [:pointer], :void
466
- callback :raptor_iostream_read_bytes_func, [:pointer, :pointer, :size_t, :size_t], :int
467
- callback :raptor_iostream_read_eof_func, [:pointer], :int
468
- attach_function :raptor_new_iostream_from_handler2, [:pointer, :pointer], raptor_iostream
469
- attach_function :raptor_free_iostream, [raptor_iostream], :void
470
-
471
- class IOStreamHandler < ::FFI::Struct
472
- layout :version, :int,
473
- :init, :raptor_iostream_init_func,
474
- :finish, :raptor_iostream_finish_func,
475
- :write_byte, :raptor_iostream_write_byte_func,
476
- :write_bytes, :raptor_iostream_write_bytes_func,
477
- :write_end, :raptor_iostream_write_end_func,
478
- :read_bytes, :raptor_iostream_read_bytes_func,
479
- :read_eof, :raptor_iostream_read_eof_func
480
-
481
- attr_accessor :rubyio
482
-
483
- def initialize(*args)
484
- super
485
- # Keep a ruby land reference to our procs so they don't
486
- # get snatched by GC.
487
- @procs = {}
488
-
489
- self[:version] = 2
490
-
491
- # @procs[:init] = self[:init] = Proc.new do |context|
492
- # $stderr.puts("#{self.class}: init")
493
- # end
494
- # @procs[:finish] = self[:finish] = Proc.new do |context|
495
- # $stderr.puts("#{self.class}: finish")
496
- # end
497
- @procs[:write_byte] = self[:write_byte] = Proc.new do |context, byte|
498
- begin
499
- @rubyio.putc(byte)
500
- rescue
501
- return 1
502
- end
503
- 0
504
- end
505
- @procs[:write_bytes] = self[:write_bytes] = Proc.new do |context, data, size, nmemb|
506
- begin
507
- @rubyio.write(data.read_string(size * nmemb))
508
- rescue
509
- return 1
510
- end
511
- 0
512
- end
513
- # @procs[:write_end] = self[:write_end] = Proc.new do |context|
514
- # $stderr.puts("#{self.class}: write_end")
515
- # end
516
- # @procs[:read_bytes] = self[:read_bytes] = Proc.new do |context, data, size, nmemb|
517
- # $stderr.puts("#{self.class}: read_bytes")
518
- # end
519
- # @procs[:read_eof] = self[:read_eof] = Proc.new do |context|
520
- # $stderr.puts("#{self.class}: read_eof")
521
- # end
522
- end
523
- end
524
-
525
- # @see http://librdf.org/raptor/api/raptor-section-xml-namespace.html
526
- define_pointer :raptor_namespace
527
-
528
- # @see http://librdf.org/raptor/api/raptor-section-serializer.html
529
- define_pointer :raptor_serializer
530
- attach_function :raptor_new_serializer, [:string], raptor_serializer
531
- attach_function :raptor_free_serializer, [raptor_serializer], :void
532
- attach_function :raptor_serialize_start_to_iostream, [raptor_serializer, raptor_uri, raptor_iostream], :int
533
- attach_function :raptor_serialize_start_to_filename, [raptor_serializer, :string], :int
534
- attach_function :raptor_serialize_statement, [raptor_serializer, raptor_statement], :int
535
- attach_function :raptor_serialize_end, [raptor_serializer], :int
536
- attach_function :raptor_serializer_set_error_handler, [raptor_serializer, :pointer, :raptor_message_handler], :void
537
- attach_function :raptor_serializer_set_warning_handler, [raptor_serializer, :pointer, :raptor_message_handler], :void
538
-
539
- # Initialize the world
540
- # We do this exactly once and never release because we can't delegate
541
- # any memory management to the ruby GC.
542
- # Internally raptor_init/raptor_finish work with ref-counts.
543
- raptor_init
544
-
545
- end
546
- end
547
- end
548
-
193
+ ffi_lib ::FFI::Library::LIBC
194
+ attach_function :strlen, [:pointer], :size_t
195
+ end # LibC
196
+ end # FFI
197
+ end # RDF::Raptor
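
Note on the removed `IOStreamHandler` callbacks above: the `write_byte` and `write_bytes` blocks are created with `Proc.new` inside `initialize` and use `return 1` in their `rescue` branch. In a non-lambda proc, `return` tries to return from the enclosing method. By the time Raptor invokes the callback, `initialize` has already finished, so Ruby raises `LocalJumpError` and the error code 1 is never handed back. The sketch below reproduces this in plain Ruby without FFI. `CallbackDemo` and `FailingIO` are hypothetical names used only for illustration and are not part of rdf-raptor; using `next` is one possible fix, not necessarily what 0.4.1 does.

```ruby
# Hypothetical stand-in for the removed IOStreamHandler; no FFI involved.
class CallbackDemo
  attr_reader :with_return, :with_next

  def initialize(io)
    @io = io

    # Mirrors the removed code: `return` targets #initialize, which has
    # already finished by the time the callback is invoked.
    @with_return = Proc.new do |byte|
      begin
        @io.putc(byte)
      rescue
        return 1
      end
      0
    end

    # `next` leaves only the block, so 1 becomes the callback's result.
    @with_next = Proc.new do |byte|
      begin
        @io.putc(byte)
      rescue
        next 1
      end
      0
    end
  end
end

# An IO double whose writes always fail (hypothetical, for illustration).
class FailingIO
  def putc(_byte)
    raise IOError, "write failed"
  end
end
```

With `demo = CallbackDemo.new(FailingIO.new)`, calling `demo.with_next.call(65)` yields the error code, while `demo.with_return.call(65)` raises `LocalJumpError` instead of returning it. When the write succeeds, both procs fall through to the trailing `0`.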