graffiti 2.1
- data/COPYING +676 -0
- data/ChangeLog.mtn +233 -0
- data/README.rdoc +129 -0
- data/TODO +30 -0
- data/doc/diagrams/graffiti-classes.svg +157 -0
- data/doc/diagrams/graffiti-deployment.svg +117 -0
- data/doc/diagrams/graffiti-store-sequence.svg +69 -0
- data/doc/diagrams/squish-select-sequence.svg +266 -0
- data/doc/examples/samizdat-rdf-config.yaml +77 -0
- data/doc/examples/samizdat-triggers-pgsql.sql +266 -0
- data/doc/papers/collreif.tex +462 -0
- data/doc/papers/rdf-to-relational-query-translation-icis2009.tex +936 -0
- data/doc/papers/rel-rdf.tex +545 -0
- data/doc/rdf-impl-report.txt +126 -0
- data/graffiti.gemspec +21 -0
- data/lib/graffiti.rb +15 -0
- data/lib/graffiti/debug.rb +34 -0
- data/lib/graffiti/exceptions.rb +20 -0
- data/lib/graffiti/rdf_config.rb +78 -0
- data/lib/graffiti/rdf_property_map.rb +92 -0
- data/lib/graffiti/sql_mapper.rb +916 -0
- data/lib/graffiti/squish.rb +568 -0
- data/lib/graffiti/store.rb +100 -0
- data/setup.rb +1360 -0
- data/test/ts_graffiti.rb +455 -0
- metadata +122 -0
data/doc/rdf-impl-report.txt
ADDED
@@ -0,0 +1,126 @@
|
|
1
|
+
Samizdat RDF Implementation Report
|
2
|
+
==================================
|
3
|
+
|
4
|
+
http://lists.w3.org/Archives/Public/www-rdf-interest/2003Sep/0043.html
|
5
|
+
|
6
|
+
Implementation
|
7
|
+
--------------
|
8
|
+
|
9
|
+
http://www.nongnu.org/samizdat/
|
10
|
+
|
11
|
+
Samizdat is a generic RDF-based engine for building collaboration and
|
12
|
+
open publishing web sites. Samizdat will let everyone publish, view,
|
13
|
+
comment, edit, and aggregate text and multimedia resources, vote on
|
14
|
+
ratings and classifications, filter resources by flexible sets of
|
15
|
+
criteria, cooperate and coordinate on all kinds of activities (see
|
16
|
+
Design Goals document). Samizdat intends to promote values of freedom,
|
17
|
+
openness, equality, and cooperation.
|
18
|
+
|
19
|
+
The Samizdat engine is implemented using the Ruby programming language, the Apache
|
20
|
+
mod_ruby module, and PostgreSQL RDBMS, and is available under the GNU
|
21
|
+
General Public License, version 2 or later.
|
22
|
+
|
23
|
+
Project development started in December 2002; the first public release was
|
24
|
+
announced in June 2003. This report refers to Samizdat 0.0.4,
|
25
|
+
released on 2003-09-01.
|
26
|
+
|
27
|
+
Functionality covered by this version includes: registering site
|
28
|
+
members, publishing and replying to messages, uploading multimedia
|
29
|
+
messages, voting on standard tags on resources, and hand-editing or using
|
30
|
+
a GUI for constructing and publishing Squish queries that can be used to
|
31
|
+
search and filter site resources.
|
32
|
+
|
33
|
+
|
34
|
+
RDF Schema
|
35
|
+
----------
|
36
|
+
|
37
|
+
Samizdat defines its own RDF schema for description of site members,
|
38
|
+
published messages, votes, and other site resources (see Concepts
|
39
|
+
document). One of the outstanding features of the Samizdat schema is the use
|
40
|
+
of statement reification in approval of content classification with
|
41
|
+
votes cast by site members.
|
42
|
+
|
43
|
+
Samizdat RDF schema uses Dublin Core metadata where applicable; also,
|
44
|
+
integration of site member descriptions with FOAF is planned.
|
45
|
+
|
46
|
+
One of the problems encountered in Samizdat RDF Schema development was
|
47
|
+
the lack of standard metadata describing discussion threads. While other
|
48
|
+
properties defined in Samizdat schema denote Samizdat-specific concepts,
|
49
|
+
such as "vote" and "rating", it is more desirable to use commonly agreed
|
50
|
+
metadata for threading structure in place of implementation-local
|
51
|
+
"thread" and "inReplyTo" properties.
|
52
|
+
|
53
|
+
|
54
|
+
RDF Import and Export
|
55
|
+
---------------------
|
56
|
+
|
57
|
+
While the Samizdat model follows the RDF Concepts and RDF Semantics
|
58
|
+
recommendations (with the exceptions noted below), the engine does
|
59
|
+
not externally interchange RDF data and thus does not use RDF/XML or
|
60
|
+
other RDF serialization format. It is assumed that, when the need for
|
61
|
+
RDF import and export arises, it can be implemented externally on top of
|
62
|
+
the Samizdat RDF storage module and using existing RDF frameworks such
|
63
|
+
as Redland.
|
64
|
+
|
65
|
+
|
66
|
+
Datatyped Literals
|
67
|
+
------------------
|
68
|
+
|
69
|
+
Samizdat doesn't implement datatyped literals, and relies on underlying
|
70
|
+
PostgreSQL capabilities for mapping between literal values and their
|
71
|
+
string representations. Outside of SQL context, literals are interpreted
|
72
|
+
as opaque strings; XML literals are not treated specially, and datatype
|
73
|
+
information is not preserved.
|
74
|
+
|
75
|
+
However, support of XML schema datatypes is considered necessary in
|
76
|
+
order to untie a Samizdat knowledge base from specifics of underlying
|
77
|
+
RDF storage, and will be implemented as a prerequisite for migration to
|
78
|
+
a selection of alternative RDF storage backends (candidates are FramerD,
|
79
|
+
3store, and Redland).
|
80
|
+
|
81
|
+
|
82
|
+
Language Tags
|
83
|
+
-------------
|
84
|
+
|
85
|
+
Literal language tags are not honoured; the "dc:language" property is
|
86
|
+
supposed to be used to denote message language.
|
87
|
+
|
88
|
+
|
89
|
+
Entailments
|
90
|
+
-----------
|
91
|
+
|
92
|
+
Samizdat RDF storage only implements simple entailment; vocabulary
|
93
|
+
entailment is not implemented yet. At the moment, simple entailment
|
94
|
+
suffices for all features of the Samizdat engine. If and when vocabulary
|
95
|
+
entailment becomes necessary, it will be implemented in Samizdat RDF
|
96
|
+
storage module or relegated to an alternative RDF storage backend,
|
97
|
+
depending on status of backend alternatives for Samizdat at that time.
|
98
|
+
|
99
|
+
|
100
|
+
Query Support
|
101
|
+
-------------
|
102
|
+
|
103
|
+
Samizdat RDF storage implements a translation of RDF query graphs
|
104
|
+
written in extended Squish into relational SQL queries and allows purely
|
105
|
+
relational representation of selected properties of site resources (see
|
106
|
+
RDF Storage and Storage Implementation documents).
|
107
|
+
|
108
|
+
It must be noted that, at the moment, the status of RDF query language
|
109
|
+
standards is unsatisfactory.
|
110
|
+
|
111
|
+
The DAML Query Language abstract specification provides an excellent formal
|
112
|
+
basis, but does not encompass all capabilities of existing RDF query
|
113
|
+
languages. Also, existing query languages are limited in one way or
|
114
|
+
another, are underformalized (most are defined by a single
|
115
|
+
implementation), and often overloaded with baroque syntax.
|
116
|
+
|
117
|
+
Two major features that were missed the most in existing query languages
|
118
|
+
at the time of Samizdat RDF storage implementation were: knowledge base
|
119
|
+
update that allows merging complex constructs into the site KB graph
|
120
|
+
(implemented in Samizdat RDF Data Manipulation Language), and workflow
|
121
|
+
control providing at least transaction rollback (in Samizdat, underlying
|
122
|
+
PostgreSQL transactions are used). Other Squish extensions implemented
|
123
|
+
in Samizdat are literal conditions and answer collection ordering
|
124
|
+
(currently, relegated to PostgreSQL; ideally, interpreted according to
|
125
|
+
literal datatypes).
|
126
|
+
|
data/graffiti.gemspec
ADDED
@@ -0,0 +1,21 @@
|
|
1
|
+
Gem::Specification.new do |spec|
|
2
|
+
spec.name = 'graffiti'
|
3
|
+
spec.version = '2.1'
|
4
|
+
spec.author = 'Dmitry Borodaenko'
|
5
|
+
spec.email = 'angdraug@debian.org'
|
6
|
+
spec.homepage = 'https://github.com/angdraug/graffiti'
|
7
|
+
spec.summary = 'Relational RDF store for Ruby'
|
8
|
+
spec.description = <<-EOF
|
9
|
+
Graffiti is an RDF store based on dynamic translation of RDF queries into SQL.
|
10
|
+
Graffiti allows one to map any relational database schema into RDF semantics
|
11
|
+
and vice versa, to store any RDF data in a relational database.
|
12
|
+
|
13
|
+
Graffiti uses Sequel to connect to database backend and provides a DBI-like
|
14
|
+
interface to run RDF queries in Squish query language from Ruby applications.
|
15
|
+
EOF
|
16
|
+
spec.files = `git ls-files`.split "\n"
|
17
|
+
spec.test_files = Dir['test/ts_*.rb']
|
18
|
+
spec.license = 'GPL3+'
|
19
|
+
spec.add_dependency('syncache')
|
20
|
+
spec.add_dependency('sequel')
|
21
|
+
end
|
data/lib/graffiti.rb
ADDED
@@ -0,0 +1,15 @@
|
|
1
|
+
# Graffiti RDF Store
|
2
|
+
# (originally written for Samizdat project)
|
3
|
+
#
|
4
|
+
# Copyright (c) 2002-2009 Dmitry Borodaenko <angdraug@debian.org>
|
5
|
+
#
|
6
|
+
# This program is free software.
|
7
|
+
# You can distribute/modify this program under the terms of
|
8
|
+
# the GNU General Public License version 3 or later.
|
9
|
+
#
|
10
|
+
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
|
11
|
+
# see doc/storage-impl.txt for explanation of implemented algorithms
|
12
|
+
#
|
13
|
+
# vim: et sw=2 sts=2 ts=8 tw=0
|
14
|
+
|
15
|
+
require 'graffiti/store'
|
data/lib/graffiti/debug.rb
ADDED
@@ -0,0 +1,34 @@
|
|
1
|
+
# Graffiti RDF Store
|
2
|
+
# (originally written for Samizdat project)
|
3
|
+
#
|
4
|
+
# Copyright (c) 2002-2011 Dmitry Borodaenko <angdraug@debian.org>
|
5
|
+
#
|
6
|
+
# This program is free software.
|
7
|
+
# You can distribute/modify this program under the terms of
|
8
|
+
# the GNU General Public License version 3 or later.
|
9
|
+
#
|
10
|
+
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
|
11
|
+
# see doc/storage-impl.txt for explanation of implemented algorithms
|
12
|
+
#
|
13
|
+
# vim: et sw=2 sts=2 ts=8 tw=0
|
14
|
+
|
15
|
+
module Graffiti
|
16
|
+
|
17
|
+
module Debug
|
18
|
+
private
|
19
|
+
|
20
|
+
DEBUG = false
|
21
|
+
|
22
|
+
def debug(message = nil)
|
23
|
+
return unless DEBUG
|
24
|
+
|
25
|
+
log message if message
|
26
|
+
log yield if block_given?
|
27
|
+
end
|
28
|
+
|
29
|
+
def log(message)
|
30
|
+
STDERR << 'Graffiti: ' << message.to_s << "\n"
|
31
|
+
end
|
32
|
+
end
|
33
|
+
|
34
|
+
end
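The `debug` helper above accepts either a ready-made message or a block, so that expensive dumps are only computed when debugging is switched on. A minimal sketch of that lazy-evaluation pattern follows; `LazyDebug`, its `enabled` flag, the `StringIO` sink, and the `Worker` class are hypothetical stand-ins introduced for illustration (the original uses a `DEBUG` constant and writes to `STDERR`).

```ruby
require 'stringio'

# Hypothetical stand-in for Graffiti::Debug: a switchable flag and an
# in-memory sink make the lazy evaluation observable.
module LazyDebug
  def self.enabled=(flag)
    @enabled = flag
  end

  def self.enabled?
    @enabled ? true : false
  end

  def self.sink
    @sink ||= StringIO.new
  end

  # Same control flow as the original: bail out early when disabled,
  # otherwise log the message and/or the value returned by the block.
  def debug(message = nil)
    return unless LazyDebug.enabled?

    log message if message
    log yield if block_given?
  end

  def log(message)
    LazyDebug.sink << 'Graffiti: ' << message.to_s << "\n"
  end
end

class Worker
  include LazyDebug

  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Pretend this is costly; count how often it actually runs.
  def expensive_dump
    @calls += 1
    'big structure'
  end

  def run
    debug { expensive_dump }
  end
end

worker = Worker.new
worker.run                 # disabled: the block is never evaluated
LazyDebug.enabled = true
worker.run                 # enabled: the block runs once and is logged
```

Passing a block rather than a pre-built string is what keeps the disabled path cheap: the string interpolation never happens.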
|
data/lib/graffiti/exceptions.rb
ADDED
@@ -0,0 +1,20 @@
|
|
1
|
+
# Graffiti RDF Store
|
2
|
+
# (originally written for Samizdat project)
|
3
|
+
#
|
4
|
+
# Copyright (c) 2002-2009 Dmitry Borodaenko <angdraug@debian.org>
|
5
|
+
#
|
6
|
+
# This program is free software.
|
7
|
+
# You can distribute/modify this program under the terms of
|
8
|
+
# the GNU General Public License version 3 or later.
|
9
|
+
#
|
10
|
+
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
|
11
|
+
# see doc/storage-impl.txt for explanation of implemented algorithms
|
12
|
+
#
|
13
|
+
# vim: et sw=2 sts=2 ts=8 tw=0
|
14
|
+
|
15
|
+
module Graffiti
|
16
|
+
|
17
|
+
# raised for syntax errors in Squish statements
|
18
|
+
class ProgrammingError < RuntimeError; end
|
19
|
+
|
20
|
+
end
|
data/lib/graffiti/rdf_config.rb
ADDED
@@ -0,0 +1,78 @@
|
|
1
|
+
# Graffiti RDF Store
|
2
|
+
# (originally written for Samizdat project)
|
3
|
+
#
|
4
|
+
# Copyright (c) 2002-2011 Dmitry Borodaenko <angdraug@debian.org>
|
5
|
+
#
|
6
|
+
# This program is free software.
|
7
|
+
# You can distribute/modify this program under the terms of
|
8
|
+
# the GNU General Public License version 3 or later.
|
9
|
+
#
|
10
|
+
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
|
11
|
+
# see doc/storage-impl.txt for explanation of implemented algorithms
|
12
|
+
#
|
13
|
+
# vim: et sw=2 sts=2 ts=8 tw=0
|
14
|
+
|
15
|
+
require 'graffiti/rdf_property_map'
|
16
|
+
|
17
|
+
module Graffiti
|
18
|
+
|
19
|
+
# Configuration of relational RDF storage (see examples)
|
20
|
+
#
|
21
|
+
class RdfConfig
|
22
|
+
def initialize(config)
|
23
|
+
@ns = config['ns']
|
24
|
+
|
25
|
+
@map = {}
|
26
|
+
|
27
|
+
config['map'].each_pair do |p, m|
|
28
|
+
table, field = m.to_a.first
|
29
|
+
p = ns_expand(p)
|
30
|
+
@map[p] = RdfPropertyMap.new(p, table, field)
|
31
|
+
end
|
32
|
+
|
33
|
+
if config['subproperties'].kind_of? Hash
|
34
|
+
config['subproperties'].each_pair do |p, subproperties|
|
35
|
+
p = ns_expand(p)
|
36
|
+
map = @map[p] or raise RuntimeError,
|
37
|
+
"Incorrect RDF storage configuration: superproperty #{p} must be mapped"
|
38
|
+
map.superproperty = true
|
39
|
+
|
40
|
+
qualifier = RdfPropertyMap.qualifier_property(p)
|
41
|
+
@map[qualifier] = RdfPropertyMap.new(
|
42
|
+
qualifier, map.table, RdfPropertyMap.qualifier_field(map.field))
|
43
|
+
|
44
|
+
subproperties.each do |subp|
|
45
|
+
subp = ns_expand(subp)
|
46
|
+
@map[subp] = RdfPropertyMap.new(subp, map.table, map.field)
|
47
|
+
@map[subp].subproperty_of = p
|
48
|
+
end
|
49
|
+
end
|
50
|
+
end
|
51
|
+
|
52
|
+
if config['transitive_closure'].kind_of? Hash
|
53
|
+
config['transitive_closure'].each_pair do |p, table|
|
54
|
+
@map[ ns_expand(p) ].transitive_closure = table
|
55
|
+
|
56
|
+
if config['subproperties'].kind_of?(Hash) and config['subproperties'][p]
|
57
|
+
config['subproperties'][p].each do |subp|
|
58
|
+
@map[ ns_expand(subp) ].transitive_closure = table
|
59
|
+
end
|
60
|
+
end
|
61
|
+
end
|
62
|
+
end
|
63
|
+
end
|
64
|
+
|
65
|
+
# hash of namespaces
|
66
|
+
attr_reader :ns
|
67
|
+
|
68
|
+
# map internal property names with expanded namespaces to RdfPropertyMap
|
69
|
+
# objects
|
70
|
+
#
|
71
|
+
attr_reader :map
|
72
|
+
|
73
|
+
def ns_expand(p)
|
74
|
+
p and p.sub(/\A(\S+?)::/) { @ns[$1] }
|
75
|
+
end
|
76
|
+
end
|
77
|
+
|
78
|
+
end
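To make the configuration handling above concrete, here is a small sketch of how a `ns` table and a `map` section could be expanded. The namespace URIs, table names, and field names are illustrative assumptions, and `ns_expand` is restated as a free function rather than taken from the class.

```ruby
# Hypothetical namespace table; prefixes and URIs are for illustration only.
NS = {
  'dc' => 'http://purl.org/dc/elements/1.1/',
  's'  => 'http://example.org/schema#'
}

# Same substitution as RdfConfig#ns_expand: a leading "prefix::" is
# replaced with the namespace URI registered under that prefix.
def ns_expand(ns, name)
  name and name.sub(/\A(\S+?)::/) { ns[$1] }
end

# Hypothetical "map" section: property => { table => field }.
MAP = {
  'dc::title'    => { 'message' => 'title' },
  's::inReplyTo' => { 'message' => 'parent' }
}

# Mirror the loop in RdfConfig#initialize: take the single
# table/field pair of each entry and key it by the expanded uriref.
expanded = {}
MAP.each_pair do |property, mapping|
  table, field = mapping.to_a.first
  expanded[ns_expand(NS, property)] = [table, field]
end
```

Names without a `prefix::` part pass through unchanged, and `nil` stays `nil` because of the leading `name and` guard.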
|
data/lib/graffiti/rdf_property_map.rb
ADDED
@@ -0,0 +1,92 @@
|
|
1
|
+
# Graffiti RDF Store
|
2
|
+
# (originally written for Samizdat project)
|
3
|
+
#
|
4
|
+
# Copyright (c) 2002-2011 Dmitry Borodaenko <angdraug@debian.org>
|
5
|
+
#
|
6
|
+
# This program is free software.
|
7
|
+
# You can distribute/modify this program under the terms of
|
8
|
+
# the GNU General Public License version 3 or later.
|
9
|
+
#
|
10
|
+
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
|
11
|
+
# see doc/storage-impl.txt for explanation of implemented algorithms
|
12
|
+
#
|
13
|
+
# vim: et sw=2 sts=2 ts=8 tw=0
|
14
|
+
|
15
|
+
module Graffiti
|
16
|
+
|
17
|
+
# Map of an internal RDF property into relational storage
|
18
|
+
#
|
19
|
+
class RdfPropertyMap
|
20
|
+
|
21
|
+
# special qualifier map
|
22
|
+
#
|
23
|
+
# ' ' is added to the property name to make sure it can't clash with any
|
24
|
+
# valid property uriref
|
25
|
+
#
|
26
|
+
def RdfPropertyMap.qualifier_property(property, type = 'subproperty')
|
27
|
+
property + ' ' + type
|
28
|
+
end
|
29
|
+
|
30
|
+
# special qualifier field
|
31
|
+
#
|
32
|
+
def RdfPropertyMap.qualifier_field(field, type = 'subproperty')
|
33
|
+
field + '_' + type
|
34
|
+
end
|
35
|
+
|
36
|
+
def initialize(property, table, field)
|
37
|
+
# fixme: support ambiguous mappings
|
38
|
+
@property = property
|
39
|
+
@table = table
|
40
|
+
@field = field
|
41
|
+
end
|
42
|
+
|
43
|
+
# expanded uriref of the mapped property
|
44
|
+
#
|
45
|
+
attr_reader :property
|
46
|
+
|
47
|
+
# name of the table into which the property is mapped (property domain is an
|
48
|
+
# internal resource class mapped into this table)
|
49
|
+
#
|
50
|
+
attr_reader :table
|
51
|
+
|
52
|
+
# name of the field into which the property is mapped
|
53
|
+
#
|
54
|
+
# if property range is not a literal, the field is a reference to the
|
55
|
+
# resource table
|
56
|
+
#
|
57
|
+
attr_reader :field
|
58
|
+
|
59
|
+
# expanded uriref of the property which this property is a subproperty of
|
60
|
+
#
|
61
|
+
# if set, this property maps into the same table and field as its
|
62
|
+
# superproperty, and is qualified by an additional field named
|
63
|
+
# <field>_subproperty which refers to a uriref resource holding uriref of
|
64
|
+
# this subproperty
|
65
|
+
#
|
66
|
+
attr_accessor :subproperty_of
|
67
|
+
|
68
|
+
attr_writer :superproperty
|
69
|
+
|
70
|
+
# set to +true+ if this property has subproperties
|
71
|
+
#
|
72
|
+
def superproperty?
|
73
|
+
@superproperty or false
|
74
|
+
end
|
75
|
+
|
76
|
+
# name of transitive closure table for a transitive property
|
77
|
+
#
|
78
|
+
# the format of a transitive closure table is:
|
79
|
+
#
|
80
|
+
# - 'resource' field refers to the subject resource id
|
81
|
+
# - '<field>' property field and '<field>_subproperty' qualifier field (in
|
82
|
+
# case of subproperty) have the same name as in the main table
|
83
|
+
# - 'distance' field holds the distance from subject to object in the RDF
|
84
|
+
# graph
|
85
|
+
#
|
86
|
+
# the transitive closure table is automatically updated by a trigger on every
|
87
|
+
# update of the main table
|
88
|
+
#
|
89
|
+
attr_accessor :transitive_closure
|
90
|
+
end
|
91
|
+
|
92
|
+
end
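The qualifier naming rules described in the comments above can be illustrated with a short sketch. The two helpers are restated as free functions; the property uriref and the field name are hypothetical, and the column list only restates the transitive closure table format given in the comments.

```ruby
# Restatement of RdfPropertyMap.qualifier_property as a free function.
# The space guarantees the result cannot clash with a valid uriref.
def qualifier_property(property, type = 'subproperty')
  property + ' ' + type
end

# Restatement of RdfPropertyMap.qualifier_field as a free function.
def qualifier_field(field, type = 'subproperty')
  field + '_' + type
end

superproperty = 'http://example.org/schema#relatedTo'  # hypothetical uriref
field         = 'related'                              # hypothetical column

qp = qualifier_property(superproperty)
qf = qualifier_field(field)

# Columns a transitive closure table would carry for this property,
# following the format listed in the comments above.
closure_columns = ['resource', field, qf, 'distance']
```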
|
data/lib/graffiti/sql_mapper.rb
ADDED
@@ -0,0 +1,916 @@
|
|
1
|
+
# Graffiti RDF Store
|
2
|
+
# (originally written for Samizdat project)
|
3
|
+
#
|
4
|
+
# Copyright (c) 2002-2011 Dmitry Borodaenko <angdraug@debian.org>
|
5
|
+
#
|
6
|
+
# This program is free software.
|
7
|
+
# You can distribute/modify this program under the terms of
|
8
|
+
# the GNU General Public License version 3 or later.
|
9
|
+
#
|
10
|
+
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
|
11
|
+
# see doc/storage-impl.txt for explanation of implemented algorithms
|
12
|
+
#
|
13
|
+
# vim: et sw=2 sts=2 ts=8 tw=0
|
14
|
+
|
15
|
+
require 'delegate'
|
16
|
+
require 'uri/common'
|
17
|
+
require 'graffiti/rdf_property_map'
|
18
|
+
require 'graffiti/squish'
|
19
|
+
|
20
|
+
module Graffiti
|
21
|
+
|
22
|
+
class SqlNodeBinding
|
23
|
+
def initialize(table_alias, field)
|
24
|
+
@alias = table_alias
|
25
|
+
@field = field
|
26
|
+
end
|
27
|
+
|
28
|
+
attr_reader :alias, :field
|
29
|
+
|
30
|
+
def to_s
|
31
|
+
@alias + '.' + @field
|
32
|
+
end
|
33
|
+
|
34
|
+
alias :inspect :to_s
|
35
|
+
|
36
|
+
def eql?(binding)
|
37
|
+
@alias == binding.alias and @field == binding.field
|
38
|
+
end
|
39
|
+
|
40
|
+
alias :'==' :eql?
|
41
|
+
|
42
|
+
def hash
|
43
|
+
self.to_s.hash
|
44
|
+
end
|
45
|
+
end
|
46
|
+
|
47
|
+
|
48
|
+
class SqlExpression < DelegateClass(Array)
|
49
|
+
def initialize(*parts)
|
50
|
+
super parts
|
51
|
+
end
|
52
|
+
|
53
|
+
def to_s
|
54
|
+
'(' << self.join(' ') << ')'
|
55
|
+
end
|
56
|
+
|
57
|
+
alias :to_str :to_s
|
58
|
+
|
59
|
+
def traverse(&block)
|
60
|
+
self.each do |part|
|
61
|
+
case part
|
62
|
+
when SqlExpression
|
63
|
+
part.traverse(&block)
|
64
|
+
else
|
65
|
+
yield
|
66
|
+
end
|
67
|
+
end
|
68
|
+
end
|
69
|
+
|
70
|
+
def rebind!(rebind, &block)
|
71
|
+
self.each_with_index do |part, i|
|
72
|
+
case part
|
73
|
+
when SqlExpression
|
74
|
+
part.rebind!(rebind, &block)
|
75
|
+
when SqlNodeBinding
|
76
|
+
if rebind[part]
|
77
|
+
self[i] = rebind[part]
|
78
|
+
yield part if block_given?
|
79
|
+
end
|
80
|
+
end
|
81
|
+
end
|
82
|
+
end
|
83
|
+
|
84
|
+
alias :eql? :'=='
|
85
|
+
|
86
|
+
def hash
|
87
|
+
self.to_s.hash
|
88
|
+
end
|
89
|
+
end
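Because `SqlExpression` aliases `to_str` to `to_s`, nested expressions render themselves when an outer expression is joined or concatenated with a string. The following sketch shows that effect with a hypothetical `Expr` class; the SQL fragments are made up, and the sketch builds the string with `+` rather than `<<` purely to avoid mutating a string literal.

```ruby
require 'delegate'

# Hypothetical stand-in for SqlExpression: an Array of parts that
# renders as a parenthesised, space-separated string.
class Expr < DelegateClass(Array)
  def initialize(*parts)
    super parts
  end

  def to_s
    '(' + self.join(' ') + ')'
  end

  # Implicit string conversion lets Array#join and String#+ render
  # nested expressions with their own parentheses.
  alias :to_str :to_s
end

inner = Expr.new('a.rating', '>', '0')
outer = Expr.new(inner, 'AND', 'b.id = a.id')

sql = 'WHERE ' + outer
```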
|
90
|
+
|
91
|
+
|
92
|
+
# Transform RDF query pattern graph into a relational join expression.
|
93
|
+
#
|
94
|
+
class SqlMapper
|
95
|
+
include Debug
|
96
|
+
|
97
|
+
def initialize(config, pattern, negative = [], optional = [], global_filter = '')
|
98
|
+
@config = config
|
99
|
+
@global_filter = global_filter
|
100
|
+
|
101
|
+
check_graph(pattern)
|
102
|
+
negative.empty? or check_graph(pattern + negative)
|
103
|
+
optional.empty? or check_graph(pattern + optional)
|
104
|
+
|
105
|
+
map_predicates(pattern, negative, optional)
|
106
|
+
transform
|
107
|
+
generate_tables_and_conditions
|
108
|
+
|
109
|
+
@jc = @aliases = @ac = @global_filter = nil
|
110
|
+
end
|
111
|
+
|
112
|
+
# map clause position to table, field, and table alias
|
113
|
+
#
|
114
|
+
# position => {
|
115
|
+
# :subject => {
|
116
|
+
# :node => node,
|
117
|
+
# :field => field
|
118
|
+
# },
|
119
|
+
# :object => {
|
120
|
+
# :node => node,
|
121
|
+
# :field => field
|
122
|
+
# },
|
123
|
+
# :map => RdfPropertyMap,
|
124
|
+
# :bind_mode => < :must_bind | :may_bind | :must_not_bind >,
|
125
|
+
# :alias => alias
|
126
|
+
# }
|
127
|
+
#
|
128
|
+
attr_reader :clauses
|
129
|
+
|
130
|
+
# map node to list of positions in clauses
|
131
|
+
#
|
132
|
+
# node => {
|
133
|
+
# :positions => [
|
134
|
+
# { :clause => position, :role => < :subject | :object > }
|
135
|
+
# ],
|
136
|
+
# :bind_mode => < :must_bind | :may_bind | :must_not_bind >,
|
137
|
+
# :colors => { color1 => bind_mode1, ... },
|
138
|
+
# :ground => < true | false >
|
139
|
+
# }
|
140
|
+
#
|
141
|
+
attr_reader :nodes
|
142
|
+
|
143
|
+
# list of tables for FROM clause of SQL query
|
144
|
+
attr_reader :from
|
145
|
+
|
146
|
+
# conditions for WHERE clause of SQL query
|
147
|
+
attr_reader :where
|
148
|
+
|
149
|
+
# return node's binding, raise exception if the node isn't bound
|
150
|
+
#
|
151
|
+
def bind(node)
|
152
|
+
(@nodes[node] and @bindings[node] and (binding = @bindings[node].first)
|
153
|
+
) or raise ProgrammingError,
|
154
|
+
"Node '#{node}' is not bound by the query pattern"
|
155
|
+
|
156
|
+
@nodes[node][:positions].each do |p|
|
157
|
+
if :object == p[:role] and @clauses[ p[:clause] ][:map].subproperty_of
|
158
|
+
|
159
|
+
property = @clauses[ p[:clause] ][:map].property
|
160
|
+
return %{select_subproperty(#{binding}, #{bind(property)})}
|
161
|
+
end
|
162
|
+
end
|
163
|
+
|
164
|
+
binding
|
165
|
+
end
|
166
|
+
|
167
|
+
private
|
168
|
+
|
169
|
+
# Check whether pattern is not a disjoint graph (all nodes are
|
170
|
+
# reachable from one node when edge direction is ignored).
|
171
|
+
#
|
172
|
+
def check_graph(pattern)
|
173
|
+
nodes = pattern.transpose[1, 2].flatten.uniq # all nodes
|
174
|
+
|
175
|
+
seen = [ nodes.shift ]
|
176
|
+
found_more = true
|
177
|
+
|
178
|
+
while found_more and not nodes.empty?
|
179
|
+
found_more = false
|
180
|
+
|
181
|
+
pattern.each do |predicate, subject, object|
|
182
|
+
|
183
|
+
if seen.include?(subject) and nodes.include?(object)
|
184
|
+
seen.push(object)
|
185
|
+
nodes.delete(object)
|
186
|
+
found_more = true
|
187
|
+
|
188
|
+
elsif seen.include?(object) and nodes.include?(subject)
|
189
|
+
seen.push(subject)
|
190
|
+
nodes.delete(subject)
|
191
|
+
found_more = true
|
192
|
+
end
|
193
|
+
end
|
194
|
+
end
|
195
|
+
|
196
|
+
nodes.empty? or raise ProgrammingError, "Query pattern is a disjoint graph"
|
197
|
+
end
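The connectivity check above can be tried in isolation. This sketch restates the same loop as a hypothetical `connected?` predicate that returns a boolean instead of raising; the clause contents are made-up examples in `[predicate, subject, object]` order.

```ruby
# Returns true when every node of the pattern can be reached from the
# first node, treating each clause as an undirected edge.
def connected?(pattern)
  nodes = pattern.transpose[1, 2].flatten.uniq  # subjects and objects
  seen = [nodes.shift]
  found_more = true

  while found_more and not nodes.empty?
    found_more = false

    pattern.each do |_predicate, subject, object|
      if seen.include?(subject) and nodes.include?(object)
        seen.push(object)
        nodes.delete(object)
        found_more = true
      elsif seen.include?(object) and nodes.include?(subject)
        seen.push(subject)
        nodes.delete(subject)
        found_more = true
      end
    end
  end

  nodes.empty?
end

chain    = [['p1', '?a', '?b'], ['p2', '?b', '?c']]
disjoint = [['p1', '?a', '?b'], ['p2', '?c', '?d']]
```

Note that `transpose` requires all clauses to have the same number of elements.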
|
198
|
+
|
199
|
+
# Stage 1: Predicate Mapping (storage-impl.txt).
|
200
|
+
#
|
201
|
+
def map_predicates(pattern, negative, optional)
|
202
|
+
@nodes = {}
|
203
|
+
@clauses = []
|
204
|
+
|
205
|
+
map_pattern(pattern, :must_bind)
|
206
|
+
map_pattern(negative, :must_not_bind)
|
207
|
+
map_pattern(optional, :may_bind)
|
208
|
+
|
209
|
+
@color_counter = @must_bind_nodes = nil
|
210
|
+
|
211
|
+
refine_ambiguous_properties
|
212
|
+
|
213
|
+
debug do
|
214
|
+
@nodes.each do |node, n|
|
215
|
+
debug %{#{node}: #{n[:bind_mode]} #{n[:colors].inspect}}
|
216
|
+
end
|
217
|
+
end
|
218
|
+
end
|
219
|
+
|
220
|
+
# Label every connected component of the pattern with a different color.
|
221
|
+
#
|
222
|
+
# Pattern clause positions:
|
223
|
+
#
|
224
|
+
# 0. predicate
|
225
|
+
# 1. subject
|
226
|
+
# 2. object
|
227
|
+
# 3. filter
|
228
|
+
#
|
229
|
+
# Returns hash of node colors.
|
230
|
+
#
|
231
|
+
# Implements the {Two-pass Connected Component Labeling algorithm}
|
232
|
+
# [http://en.wikipedia.org/wiki/Connected_Component_Labeling#Two-pass]
|
233
|
+
# with an added special case to exclude _alien_nodes_ from neighbor lists.
|
234
|
+
#
|
235
|
+
# The special case ensures that parts of a may-bind or must-not-bind
|
236
|
+
# subpattern that are only connected through a must-bind node do not connect.
|
237
|
+
#
|
238
|
+
def label_pattern_components(pattern, alien_nodes, augment_alien_nodes = false)
|
239
|
+
return {} if pattern.empty?
|
240
|
+
|
241
|
+
color = {}
|
242
|
+
color_eq = [] # [ [ smaller, larger ], ... ]
|
243
|
+
nodes = pattern.transpose[1, 2].flatten.uniq
|
244
|
+
alien_nodes_here = nodes & alien_nodes
|
245
|
+
|
246
|
+
@color_counter = @color_counter ? @color_counter.next : 0
|
247
|
+
color[ nodes[0] ] = @color_counter
|
248
|
+
|
249
|
+
# first pass
|
250
|
+
1.upto(nodes.size - 1) do |i|
|
251
|
+
node = nodes[i]
|
252
|
+
|
253
|
+
pattern.each do |predicate, subject, object, filter|
|
254
|
+
if node == subject
|
255
|
+
neighbor = object
|
256
|
+
elsif node == object
|
257
|
+
neighbor = subject
|
258
|
+
end
|
259
|
+
next if neighbor.nil? or color[neighbor].nil? or
|
260
|
+
alien_nodes_here.include?(neighbor)
|
261
|
+
|
262
|
+
if color[node].nil?
|
263
|
+
color[node] = color[neighbor]
|
264
|
+
elsif color[node] != color[neighbor] # record color equivalence
|
265
|
+
color_eq |= [ [ color[node], color[neighbor] ].sort ]
|
266
|
+
end
|
267
|
+
end
|
268
|
+
|
269
|
+
color[node] ||= (@color_counter += 1)
|
270
|
+
end
|
271
|
+
|
272
|
+
# second pass
|
273
|
+
nodes.each do |node|
|
274
|
+
while eq = color_eq.rassoc(color[node])
|
275
|
+
color[node] = eq[0]
|
276
|
+
end
|
277
|
+
end
|
278
|
+
|
279
|
+
alien_nodes.push(*nodes).uniq! if augment_alien_nodes
|
280
|
+
|
281
|
+
color
|
282
|
+
end
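To see what the alien-node special case does, the labeling can be run on a tiny pattern. The sketch below is a simplified restatement with a local counter instead of `@color_counter` and without the `augment_alien_nodes` option; the function name and the sample clauses are hypothetical.

```ruby
# Two-pass connected component labeling over [predicate, subject, object]
# clauses. Neighbors listed in alien_nodes never propagate their color.
def label_components(pattern, alien_nodes = [])
  return {} if pattern.empty?

  color = {}
  color_eq = []                      # [[smaller, larger], ...]
  nodes = pattern.transpose[1, 2].flatten.uniq
  aliens = nodes & alien_nodes
  counter = 0
  color[nodes[0]] = counter

  # first pass: inherit a neighbor's color or allocate a new one
  1.upto(nodes.size - 1) do |i|
    node = nodes[i]

    pattern.each do |_predicate, subject, object|
      neighbor =
        if node == subject then object
        elsif node == object then subject
        end
      next if neighbor.nil? or color[neighbor].nil? or aliens.include?(neighbor)

      if color[node].nil?
        color[node] = color[neighbor]
      elsif color[node] != color[neighbor]
        color_eq |= [[color[node], color[neighbor]].sort]
      end
    end

    color[node] ||= (counter += 1)
  end

  # second pass: collapse equivalent colors to the smallest one
  nodes.each do |node|
    while eq = color_eq.rassoc(color[node])
      color[node] = eq[0]
    end
  end

  color
end

star   = [['p', '?m', '?x'], ['q', '?m', '?y']]
joined = label_components(star)
split  = label_components(star, ['?m'])
```

With `?m` treated as an alien node, `?x` and `?y` end up in different components even though both touch `?m`.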
|
283
|
+
|
284
|
+
def map_pattern(pattern, bind_mode = :must_bind)
|
285
|
+
pattern = pattern.dup
|
286
|
+
@must_bind_nodes ||= []
|
287
|
+
color = label_pattern_components(pattern, @must_bind_nodes, :must_bind == bind_mode)
|
288
|
+
|
289
|
+
pattern.each do |predicate, subject, object, filter, transitive|
|
290
|
+
|
291
|
+
# validate the triple
|
292
|
+
predicate =~ URI::URI_REF or raise ProgrammingError,
|
293
|
+
"Valid uriref expected in predicate position instead of '#{predicate}'"
|
294
|
+
|
295
|
+
[subject, object].each do |node|
|
296
|
+
node =~ SquishQuery::INTERNAL or
|
297
|
+
node =~ SquishQuery::BN or
|
298
|
+
node =~ URI::URI_REF or
|
299
|
+
raise ProgrammingError,
|
300
|
+
"Resource or blank node name expected instead of '#{node}'"
|
301
|
+
end
|
302
|
+
|
303
|
+
# list of possible mappings into internal tables
|
304
|
+
map = @config.map[predicate]
|
305
|
+
|
306
|
+
if transitive and (map.nil? or map.transitive_closure.nil?)
|
307
|
+
raise ProgrammingError,
|
308
|
+
"No transitive closure is defined for #{predicate} property"
|
309
|
+
end
|
310
|
+
|
311
|
+
if map and
|
312
|
+
(subject =~ SquishQuery::BN or
|
313
|
+
subject =~ SquishQuery::INTERNAL or
|
314
|
+
subject =~ SquishQuery::PARAMETER or
|
315
|
+
'resource' == map.table)
|
316
|
+
# internal predicate and subject is mappable to resource table
|
317
|
+
|
318
|
+
i = clauses.size
|
319
|
+
|
320
|
+
@clauses[i] = {
|
321
|
+
:subject => [ { :node => subject, :field => 'id' } ],
|
322
|
+
:object => [ { :node => object, :field => map.field } ],
|
323
|
+
:map => map,
|
324
|
+
:transitive => transitive,
|
325
|
+
:bind_mode => bind_mode
|
326
|
+
}
|
327
|
+
@clauses[i][:filter] = SqlExpression.new(filter) if filter
|
328
|
+
|
329
|
+
[subject, object].each do |node|
|
330
|
+
if @nodes[node]
|
331
|
+
@nodes[node][:bind_mode] =
|
332
|
+
stronger_bind_mode(@nodes[node][:bind_mode], bind_mode)
|
333
|
+
else
|
334
|
+
@nodes[node] = { :positions => [], :bind_mode => bind_mode, :colors => {} }
|
335
|
+
end
|
336
|
+
|
337
|
+
# set of node colors, one for each bind_mode
|
338
|
+
@nodes[node][:colors][ color[node] ] = bind_mode
|
339
|
+
end
|
340
|
+
|
341
|
+
# reverse mapping of the node occurrences
|
342
|
+
@nodes[subject][:positions].push( { :clause => i, :role => :subject } )
|
343
|
+
@nodes[object][:positions].push( { :clause => i, :role => :object } )
|
344
|
+
|
345
|
+
if superp = map.subproperty_of
|
346
|
+
# link subproperty qualifier into the pattern
|
347
|
+
pattern.push(
|
348
|
+
[RdfPropertyMap.qualifier_property(superp), subject, predicate])
|
349
|
+
color[predicate] = color[object]
|
350
|
+
|
351
|
+
# no need to ground both subproperty and superproperty
|
352
|
+
@nodes[object][:ground] = true
|
353
|
+
end
|
354
|
+
|
355
|
+
else
|
356
|
+
# assume reification for unmapped predicates:
|
357
|
+
#
|
358
|
+
# | (rdf::predicate ?_stmt_#{i} p)
|
359
|
+
# (p s o) -> | (rdf::subject ?_stmt_#{i} s)
|
360
|
+
# | (rdf::object ?_stmt_#{i} o)
|
361
|
+
#
|
362
|
+
rdf = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
|
363
|
+
stmt = "?_stmt_#{i}"
|
364
|
+
pattern.push([rdf + 'predicate', stmt, predicate],
|
365
|
+
[rdf + 'subject', stmt, subject],
|
366
|
+
[rdf + 'object', stmt, object])
|
367
|
+
color[stmt] = color[predicate] = color[object]
|
368
|
+
end
|
369
|
+
end
|
370
|
+
end
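The reification fallback for unmapped predicates, described in the comments above, rewrites one clause into three. A minimal sketch of that rewrite follows; the `reify` helper, the example predicate, and the statement node name are hypothetical.

```ruby
RDF_NS = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'

# Rewrite (p s o) into three clauses about a statement node, in the
# [predicate, subject, object] order used by the pattern arrays.
def reify(predicate, subject, object, stmt)
  [
    [RDF_NS + 'predicate', stmt, predicate],
    [RDF_NS + 'subject',   stmt, subject],
    [RDF_NS + 'object',    stmt, object]
  ]
end

clauses = reify('http://example.org/schema#likes', '?who', '?what', '?_stmt_0')
```

Each generated clause has the statement node as its subject, so the three clauses stay connected through it.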
|
371
|
+
|
372
|
+
# Select strongest of the two bind modes, in the following order of
|
373
|
+
# preference:
|
374
|
+
#
|
375
|
+
# :must_bind -> :must_not_bind -> :may_bind
|
376
|
+
#
|
377
|
+
def stronger_bind_mode(mode1, mode2)
|
378
|
+
if mode1 != mode2 and (:must_bind == mode2 or :may_bind == mode1)
|
379
|
+
mode2
|
380
|
+
else
|
381
|
+
mode1
|
382
|
+
end
|
383
|
+
end
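The preference order stated above can be checked by enumerating the mode pairs. This sketch restates the method as a free function and is only meant as an illustration of the documented ordering.

```ruby
# Same decision rule as SqlMapper#stronger_bind_mode.
def stronger_bind_mode(mode1, mode2)
  if mode1 != mode2 and (:must_bind == mode2 or :may_bind == mode1)
    mode2
  else
    mode1
  end
end

modes = [:must_bind, :must_not_bind, :may_bind]

# Build a lookup of every ordered pair to its winning mode.
table = {}
modes.each do |m1|
  modes.each do |m2|
    table[[m1, m2]] = stronger_bind_mode(m1, m2)
  end
end
```

The result does not depend on argument order: `:must_bind` wins over everything, and `:must_not_bind` wins over `:may_bind`.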
|
384
|
+
|
385
|
+
# If a node can be mapped to more than one [table, field] pair, see if it can
|
386
|
+
# be refined based on other occurrences of this node in other query clauses.
|
387
|
+
#
|
388
|
+
def refine_ambiguous_properties
|
389
|
+
@nodes.each_value do |n|
|
390
|
+
map = n[:positions]
|
391
|
+
|
392
|
+
map.each_with_index do |p, i|
|
393
|
+
big = @clauses[ p[:clause] ][ p[:role] ]
|
394
|
+
next if big.size <= 1 # no refining needed
|
395
|
+
|
396
|
+
debug { n.inspect + ': ' + big.inspect }
|
397
|
+
|
398
|
+
(i + 1).upto(map.size - 1) do |j|
|
399
|
+
small_p = map[j]
|
400
|
+
small = @clauses[ small_p[:clause] ][ small_p[:role] ]
|
401
|
+
|
402
|
+
refined = big & small
|
403
|
+
if refined.size > 0 and refined.size < big.size
|
404
|
+
|
405
|
+
# refine the node...
|
406
|
+
@clauses[ p[:clause] ][ p[:role] ] = big = refined
|
407
|
+
|
408
|
+
# ...and its pair
|
409
|
+
@clauses[ p[:clause] ][ opposite_role(p[:role]) ].collect! {|pair|
|
410
|
+
refined.assoc(pair[0]) ? pair : nil
|
411
|
+
}.compact!
|
412
|
+
end
|
413
|
+
end
|
414
|
+
end
|
415
|
+
end
|
416
|
+
|
417
|
+
# drop remaining ambiguous mappings
|
418
|
+
# todo: split query for ambiguous mappings
|
419
|
+
@clauses.each do |clause|
|
420
|
+
next if clause.nil? # means it was reified
|
421
|
+
clause[:subject] = clause[:subject].first
|
422
|
+
clause[:object] = clause[:object].first
|
423
|
+
end
|
424
|
+
end
|
425
|
+
|
426
|
+
def opposite_role(role)
|
427
|
+
:subject == role ? :object : :subject
|
428
|
+
end
|
429
|
+
|
430
|
+
# Return current value of alias counter, remember which table it was assigned
|
431
|
+
# to, and increment the counter.
|
432
|
+
#
|
433
|
+
def next_alias(table, node, bind_mode = @nodes[node][:bind_mode])
|
434
|
+
@ac ||= 'a'
|
435
|
+
@aliases ||= {}
|
436
|
+
|
437
|
+
a = @ac.dup
|
438
|
+
@aliases[a] = {
|
439
|
+
:table => table,
|
440
|
+
:node => node,
|
441
|
+
:bind_mode => bind_mode,
|
442
|
+
:filter => []
|
443
|
+
}
|
444
|
+
|
445
|
+
@ac.next!
|
446
|
+
return a
|
447
|
+
end
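The alias counter in `next_alias` relies on `String#next!`, which mutates the string in place; that is why the current value is copied with `dup` before the counter advances. A short standalone sketch of the resulting alias sequence (the loop and the variable names are illustrative):

```ruby
counter = 'a'.dup   # mutable copy, advanced in place like @ac
aliases = []

28.times do
  aliases << counter.dup   # snapshot the current alias
  counter.next!            # 'a' -> 'b' ... 'z' -> 'aa' -> 'ab'
end
```

Without the `dup`, every stored alias would refer to the same string object and change along with the counter.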
|
448
|
+
+  def define_relation_aliases
+    @nodes.each do |node, n|
+
+      positions = n[:positions]
+
+      # go through all clauses with this node in subject position
+      positions.each_with_index do |p, i|
+        next if :subject != p[:role] or @clauses[ p[:clause] ][:alias]
+
+        clause = @clauses[ p[:clause] ]
+        map = clause[:map]
+        table = clause[:transitive] ? map.transitive_closure : map.table
+
+        # see if we've already mapped this node to the same table before
+        0.upto(i - 1) do |j|
+          similar_clause = @clauses[ positions[j][:clause] ]
+
+          if similar_clause[:alias] and
+            similar_clause[:map].table == table and
+            similar_clause[:map].field != map.field
+            # same node, same table, different field -> same alias
+
+            clause[:alias] = similar_clause[:alias]
+            break
+          end
+        end
+
+        if clause[:alias].nil?
+          clause[:alias] =
+            if clause[:transitive]
+              # transitive clause bind mode overrides a stronger node bind mode
+              #
+              # fixme: generic case for multiple aliases per node
+              next_alias(table, node, clause[:bind_mode])
+            else
+              next_alias(table, node)
+            end
+        end
+      end
+    end # optimize: unnecessary aliases are generated
+  end
+
+  def update_alias_filters
+    @clauses.each do |c|
+      if c[:filter]
+        @aliases[ c[:alias] ][:filter].push(c[:filter])
+      end
+    end
+  end
+
+  # Stage 2: Relation Aliases and Join Conditions (storage-impl.txt).
+  #
+  # Result is map of aliases in @aliases and list of join conditions in @jc.
+  #
+  def transform
+    define_relation_aliases
+    update_alias_filters
+
+    # [ [ binding1, binding2 ], ... ]
+    @jc = []
+    @bindings = {}
+
+    @nodes.each do |node, n|
+      positions = n[:positions]
+
+      # node binding
+      first = positions.first
+      clause = @clauses[ first[:clause] ]
+      a = clause[:alias]
+      binding = SqlNodeBinding.new(a, clause[ first[:role] ][:field])
+      @bindings[node] = [ binding ]
+
+      # join conditions
+      1.upto(positions.size - 1) do |i|
+        p = positions[i]
+        clause2 = @clauses[ p[:clause] ]
+        binding2 = SqlNodeBinding.new(clause2[:alias], clause2[ p[:role] ][:field])
+
+        unless @bindings[node].include?(binding2)
+          @bindings[node].push(binding2)
+          @jc.push([binding, binding2, node])
+          n[:ground] = true
+        end
+      end
+
+      # ground non-blank nodes
+      if node !~ SquishQuery::BN
+
+        if node =~ SquishQuery::INTERNAL # internal resource id
+          @aliases[a][:filter].push SqlExpression.new(binding, '=', $1)
+
+        elsif node =~ SquishQuery::PARAMETER or node =~ SquishQuery::LITERAL
+          @aliases[a][:filter].push SqlExpression.new(binding, '=', node)
+
+        elsif node =~ URI::URI_REF # external resource uriref
+
+          r = nil
+          positions.each do |p|
+            next unless :subject == p[:role]
+
+            c = @clauses[ p[:clause] ]
+            if 'resource' == c[:map].table
+              r = c[:alias] # reuse existing mapping to resource table
+              break
+            end
+          end
+
+          if r.nil?
+            r = next_alias('resource', node)
+            r_binding = SqlNodeBinding.new(r, 'id')
+            @bindings[node].unshift(r_binding)
+            @jc.push([ binding, r_binding, node ])
+          end
+
+          @aliases[r][:filter].push SqlExpression.new(
+            SqlNodeBinding.new(r, 'uriref'), '=', "'t'", 'AND',
+            SqlNodeBinding.new(r, 'label'), '=', %{'#{node}'})
+
+        else
+          raise RuntimeError,
+            "Invalid node '#{node}' should never occur at this point"
+        end
+
+        n[:ground] = true
+      end
+    end
+
+    debug do
+      @aliases.each {|alias_name, a| debug %{#{alias_name}: #{a.inspect}} }
+      @jc.each {|jc| debug jc.inspect }
+    end
+  end
+
+  # Produce SQL FROM and WHERE clauses from results of transform().
+  #
+  def generate_tables_and_conditions
+    main_path, seen = jc_subgraph_path(:must_bind)
+    debug { main_path.inspect }
+
+    main_path and not main_path.empty? or raise RuntimeError,
+      'Failed to find table aliases for main query'
+
+    @where = ground_dangling_blank_nodes(main_path)
+
+    joins = ''
+    subquery_count = 'a'
+
+    [ :must_not_bind, :may_bind ].each do |bind_mode|
+      loop do
+        sub_path, new = jc_subgraph_path(bind_mode, seen)
+        break if sub_path.nil? or sub_path.empty?
+
+        debug { sub_path.inspect }
+
+        sub_query, sub_join = sub_path.partition {|a,| main_path.assoc(a).nil? }
+        # fixme: make sure that sub_join is not empty
+
+        if 1 == sub_query.size
+          # simplified case: join single table directly without a subquery
+          join_alias, = sub_query.first
+          a = @aliases[join_alias]
+          join_target = a[:table]
+          join_conditions = jc_path_to_join_conditions(sub_join) + a[:filter]
+
+        else
+          # left join subquery to the main query
+          join_alias = '_subquery_' << subquery_count
+          subquery_count.next!
+
+          sub_join = subquery_jc_path(sub_join, join_alias)
+          rebind = rebind_subquery(sub_path, join_alias)
+          select_nodes = subquery_select_nodes(rebind, main_path, sub_join)
+
+          join_conditions = jc_path_to_join_conditions(sub_join, rebind,
+                                                       select_nodes)
+
+          select_nodes = select_nodes.keys.collect {|b|
+            b.to_s << ' AS ' << rebind[b].field
+          }.join(', ')
+
+          tables, conditions = jc_path_to_tables_and_conditions(sub_path)
+
+          join_target = "(\nSELECT #{select_nodes}\nFROM #{tables}"
+          join_target << "\nWHERE " << conditions unless conditions.empty?
+          join_target << "\n)"
+          join_target.gsub!(/\n(?!\)\z)/, "\n ")
+        end
+
+        joins << ("\nLEFT JOIN " + join_target + ' AS ' + join_alias + ' ON ' +
+                  join_conditions.uniq.join(' AND '))
+
+        if :must_not_bind == bind_mode
+          left_join_is_null(main_path, sub_join)
+        end
+      end
+    end
+
+    @from, main_where = jc_path_to_tables_and_conditions(main_path)
+
+    @from << joins
+
+    @where.push('(' + main_where + ')') unless main_where.empty?
+    @where.push('(' + @global_filter + ')') unless @global_filter.empty?
+    @where = @where.join("\nAND ")
+  end
+
+  # Produce a subgraph path through join conditions linking all aliases with
+  # given _bind_mode_ that form a same-color connected component of the join
+  # conditions graph and weren't processed yet:
+  #
+  #   path = [ [start, []], [ next, [ jc, ... ] ], ... ]
+  #
+  # Update _seen_ hash for all aliases included in the produced path.
+  #
+  def jc_subgraph_path(bind_mode, seen = {})
+    start = find_alias(bind_mode, seen)
+    return nil if start.nil?
+
+    new = {}
+    new[start] = true
+    path = [ [start, []] ]
+    colors = @nodes[ @aliases[start][:node] ][:colors].keys
+
+    loop do # while we can find more connecting joins of the same color
+      join_alias = nil
+
+      @jc.each do |jc|
+        # use cases:
+        # - seen is empty (composing the must-bind join)
+        # - seen is not empty (composing a subquery)
+
+        next if (colors & @nodes[ jc[2] ][:colors].keys).empty?
+
+        0.upto(1) do |i|
+          a_seen = jc[i].alias
+          a_next = jc[1-i].alias
+
+          if not new[a_next] and (
+            ((new[a_seen] or seen[a_seen]) and
+             (@aliases[a_next][:bind_mode] == bind_mode)
+             # connect an untouched node of matching bind mode
+            ) or (
+              new[a_seen] and seen[a_next] and
+              # connect subquery to the rest of the query...
+              @aliases[a_seen][:bind_mode] == bind_mode
+              # ...but only go one step deep
+            ))
+
+            join_alias = a_next
+            break
+          end
+        end
+
+        break if join_alias
+      end
+
+      break if join_alias.nil?
+
+      # join it to all seen aliases
+      join_on = @jc.find_all do |jc|
+        a1, a2 = jc[0, 2].collect {|b| b.alias }
+        (new[a1] and a2 == join_alias) or (new[a2] and a1 == join_alias)
+      end
+
+      new[join_alias] = true
+      path.push([join_alias, join_on])
+    end
+
+    seen.merge!(new)
+    [ path, new ]
+  end
+
+  def find_alias(bind_mode, seen = {})
+    @aliases.each do |alias_name, a|
+      next if seen[alias_name] or a[:bind_mode] != bind_mode
+      return alias_name
+    end
+
+    nil
+  end
+
+  # Ground all must-bind blank nodes that weren't ground elsewhere to an
+  # existential quantifier.
+  #
+  def ground_dangling_blank_nodes(main_path)
+    conditions = []
+    ground_nodes = @global_filter.scan(SquishQuery::BN_SCAN)
+
+    @nodes.each do |node, n|
+      next if (n[:ground] or ground_nodes.include?(node))
+
+      expression =
+        case n[:bind_mode]
+        when :must_bind
+          'IS NOT NULL'
+        when :must_not_bind
+          'IS NULL'
+        else
+          next
+        end
+
+      @bindings[node].each do |binding|
+        if main_path.assoc(binding.alias)
+          conditions.push SqlExpression.new(binding, expression)
+          break
+        end
+      end
+    end
+
+    conditions
+  end
+
+  # Join a subquery to the main query: for each alias shared between the two,
+  # link 'id' field of the corresponding table within and outside the subquery.
+  # If no node is bound to the 'id' field, create a virtual node bound to it,
+  # so that it can be rebound by rebind_subquery().
+  #
+  def subquery_jc_path(sub_join, join_alias)
+    sub_join.empty? and raise ProgrammingError,
+      "Unexpected empty subquery, check your RDF storage configuration"
+    # fixme: reify instead of raising an exception
+
+    sub_join.transpose[0].uniq.collect do |a|
+      binding = SqlNodeBinding.new(a, 'id')
+
+      exists = false
+      @nodes.each do |node, n|
+        if @bindings[node].include?(binding)
+          exists = true
+          break
+        end
+      end
+
+      unless exists
+        node = '?' + join_alias + '_' + a
+        @nodes[node] = { :ground => true }
+        @bindings[node] = [ binding ]
+      end
+
+      [ a, [[ binding, binding ]] ]
+    end
+  end
+
+  # Generate a hash that maps all bindings that have been wrapped inside the
+  # _sub_path_ (a jc path, see jc_subgraph_path()) to rebound bindings based
+  # on the _join_alias_ so that they may still be used in the main query.
+  #
+  def rebind_subquery(sub_path, join_alias)
+    rebind = {}
+    field_count = 'a'
+
+    wrapped = {}
+    sub_path.each {|a,| wrapped[a] = true }
+
+    @nodes.each do |node, n|
+      @bindings[node].each do |b|
+        if wrapped[b.alias] and rebind[b].nil?
+          field = '_field_' << field_count
+          field_count.next!
+          rebind[b] = SqlNodeBinding.new(join_alias, field)
+        end
+      end
+    end
+
+    rebind
+  end
+
+  # Go through global filter, filters in the main query, and join conditions
+  # attaching the subquery to the main query, rebind the bindings for nodes
+  # wrapped inside the subquery, and return a hash with keys for all bindings
+  # that should be selected from the subquery.
+  #
+  def subquery_select_nodes(rebind, main_path, sub_join)
+    select_nodes = {}
+
+    # update the global filter
+    @nodes.each do |node, n|
+      if r = rebind[ @bindings[node].first ]
+        @global_filter.gsub!(/#{Regexp.escape(node)}\b/) do
+          select_nodes[ @bindings[node].first ] = true
+          r.to_s
+        end
+      end
+    end
+
+    # update filters in the main query
+    main_path.each do |a,|
+      next if sub_join.assoc(a)
+
+      @aliases[a][:filter].each do |f|
+        f.rebind!(rebind) do |b|
+          select_nodes[b] = true
+        end
+      end
+    end
+
+    # update the subquery join path
+    sub_join.each do |a, jcs|
+      jcs.each do |jc|
+        select_nodes[ jc[0] ] = true
+        jc[1] = rebind[ jc[1] ]
+      end
+    end
+
+    # fixme: update main SELECT list
+    select_nodes
+  end
+
+  # Transform jc path (see jc_subgraph_path()) into a list of join conditions.
+  # If _rebind_ and _select_nodes_ hashes are defined, conditions will be
+  # rebound accordingly, and _select_nodes_ will be updated to include bindings
+  # used in the conditions.
+  #
+  def jc_path_to_join_conditions(jc_path, rebind = nil, select_nodes = nil)
+    conditions = []
+
+    jc_path.each do |a, jcs|
+      jcs.each do |b1, b2, n|
+        conditions.push SqlExpression.new(b1, '=', b2)
+      end
+    end
+
+    conditions.empty? and raise RuntimeError,
+      "Failed to join subquery to the main query"
+
+    conditions
+  end
+
+  # Generate FROM and WHERE clauses from a jc path (see jc_subgraph_path()).
+  #
+  def jc_path_to_tables_and_conditions(path)
+    first, = path[0]
+    a = @aliases[first]
+
+    tables = a[:table] + ' AS ' + first
+    conditions = a[:filter]
+
+    path[1, path.size - 1].each do |join_alias, join_on|
+      a = @aliases[join_alias]
+
+      tables <<
+        %{\nINNER JOIN #{a[:table]} AS #{join_alias} ON } <<
+        (
+          join_on.collect {|b1, b2| SqlExpression.new(b1, '=', b2) } +
+          a[:filter]
+        ).uniq.join(' AND ')
+    end
+
+    [ tables, conditions.uniq.join("\nAND ") ]
+  end
+
+  # Find and declare as NULL key fields of a must-not-bind subquery.
+  #
+  def left_join_is_null(main_path, sub_join)
+    sub_join.each do |a, jcs|
+      jcs.each do |jc|
+        0.upto(1) do |i|
+          if main_path.assoc(jc[i].alias).nil?
+            @where.push SqlExpression.new(jc[i], 'IS NULL')
+            break
+          end
+        end
+      end
+    end
+  end
+end
+
+end
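The way a jc path becomes a FROM clause (see jc_path_to_tables_and_conditions() in the file above) may be easier to follow in isolation. The following is a minimal standalone sketch, not part of the gem: plain strings stand in for SqlNodeBinding and SqlExpression, the helper names build_aliases and path_to_from_and_where are hypothetical, and the table names and filter are made up for illustration.

```ruby
# Illustrative sketch only: plain strings replace SqlNodeBinding and
# SqlExpression, and all names below are hypothetical.

# Assign aliases 'a', 'b', ... to tables, the way next_alias advances its
# counter with String#next!.
def build_aliases(tables)
  counter = 'a'.dup
  aliases = {}
  tables.each do |table|
    aliases[counter.dup] = { :table => table, :filter => [] }
    counter.next! # 'a' -> 'b', ..., 'z' -> 'aa'
  end
  aliases
end

# Simplified jc path walk: the first alias opens the FROM clause and supplies
# the WHERE filters; every further alias is INNER JOINed on its join
# conditions plus its own filters.
def path_to_from_and_where(aliases, path)
  first, = path[0]
  tables = "#{aliases[first][:table]} AS #{first}"
  conditions = aliases[first][:filter]

  path.drop(1).each do |join_alias, join_on|
    a = aliases[join_alias]
    on = join_on.map { |b1, b2| "#{b1} = #{b2}" } + a[:filter]
    tables << "\nINNER JOIN #{a[:table]} AS #{join_alias} ON " << on.uniq.join(' AND ')
  end

  [tables, conditions.uniq.join("\nAND ")]
end

aliases = build_aliases(%w[message resource])
aliases['a'][:filter] << 'a.id = 42'
path = [['a', []], ['b', [['a.id', 'b.id']]]]

from, where = path_to_from_and_where(aliases, path)
puts "FROM #{from}"
puts "WHERE #{where}" unless where.empty?
```

The path has the same shape as the one documented for jc_subgraph_path(): a start alias with no join conditions, followed by aliases paired with the conditions that connect them to the part of the path already built.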