graffiti 2.1
- data/COPYING +676 -0
- data/ChangeLog.mtn +233 -0
- data/README.rdoc +129 -0
- data/TODO +30 -0
- data/doc/diagrams/graffiti-classes.svg +157 -0
- data/doc/diagrams/graffiti-deployment.svg +117 -0
- data/doc/diagrams/graffiti-store-sequence.svg +69 -0
- data/doc/diagrams/squish-select-sequence.svg +266 -0
- data/doc/examples/samizdat-rdf-config.yaml +77 -0
- data/doc/examples/samizdat-triggers-pgsql.sql +266 -0
- data/doc/papers/collreif.tex +462 -0
- data/doc/papers/rdf-to-relational-query-translation-icis2009.tex +936 -0
- data/doc/papers/rel-rdf.tex +545 -0
- data/doc/rdf-impl-report.txt +126 -0
- data/graffiti.gemspec +21 -0
- data/lib/graffiti.rb +15 -0
- data/lib/graffiti/debug.rb +34 -0
- data/lib/graffiti/exceptions.rb +20 -0
- data/lib/graffiti/rdf_config.rb +78 -0
- data/lib/graffiti/rdf_property_map.rb +92 -0
- data/lib/graffiti/sql_mapper.rb +916 -0
- data/lib/graffiti/squish.rb +568 -0
- data/lib/graffiti/store.rb +100 -0
- data/setup.rb +1360 -0
- data/test/ts_graffiti.rb +455 -0
- metadata +122 -0
data/doc/rdf-impl-report.txt
ADDED
@@ -0,0 +1,126 @@
+Samizdat RDF Implementation Report
+==================================
+
+http://lists.w3.org/Archives/Public/www-rdf-interest/2003Sep/0043.html
+
+Implementation
+--------------
+
+http://www.nongnu.org/samizdat/
+
+Samizdat is a generic RDF-based engine for building collaboration and
+open publishing web sites. Samizdat will let everyone publish, view,
+comment, edit, and aggregate text and multimedia resources, vote on
+ratings and classifications, filter resources by flexible sets of
+criteria, cooperate and coordinate on all kinds of activities (see
+Design Goals document). Samizdat intends to promote values of freedom,
+openness, equality, and cooperation.
+
+The Samizdat engine is implemented using the Ruby programming language, Apache
+mod_ruby module, and PostgreSQL RDBMS, and is available under the GNU
+General Public License, version 2 or later.
+
+Project development started in December 2002; the first public release was
+announced in June 2003. This report refers to Samizdat 0.0.4,
+released on 2003-09-01.
+
+Functionality covered by this version includes: registering site
+members, publishing and replying to messages, uploading multimedia
+messages, voting on standard tags on resources; hand-editing or using
+GUI for constructing and publishing Squish queries that can be used to
+search and filter site resources.
+
+
+RDF Schema
+----------
+
+Samizdat defines its own RDF schema for description of site members,
+published messages, votes, and other site resources (see Concepts
+document). One of the outstanding features of Samizdat schema is the use
+of statement reification in approval of content classification with
+votes cast by site members.
+
+Samizdat RDF schema uses Dublin Core metadata where applicable; also,
+integration of site member descriptions with FOAF is planned.
+
+One of the problems encountered in Samizdat RDF Schema development was
+the lack of standard metadata describing discussion threads. While other
+properties defined in Samizdat schema denote Samizdat-specific concepts,
+such as "vote" and "rating", it is more desirable to use commonly agreed
+metadata for threading structure in place of implementation-local
+"thread" and "inReplyTo" properties.
+
+
+RDF Import and Export
+---------------------
+
+While Samizdat model follows RDF Concepts and RDF Semantics
+recommendations (with the exceptions put down below), the engine does
+not externally interchange RDF data and thus does not use RDF/XML or
+other RDF serialization format. It is assumed that, when the need for
+RDF import and export arises, it can be implemented externally on top of
+the Samizdat RDF storage module and using existing RDF frameworks such
+as Redland.
+
+
+Datatyped Literals
+------------------
+
+Samizdat doesn't implement datatyped literals, and relies on underlying
+PostgreSQL capabilities for mapping between literal values and their
+string representations. Outside of SQL context, literals are interpreted
+as opaque strings; XML literals are not treated specially, and datatype
+information is not preserved.
+
+However, support of XML schema datatypes is considered necessary in
+order to untie a Samizdat knowledge base from specifics of underlying
+RDF storage, and will be implemented as a prerequisite for migration to
+a selection of alternative RDF storage backends (candidates are FramerD,
+3store, and Redland).
+
+
+Language Tags
+-------------
+
+Literal language tags are not honoured; the "dc:language" property is
+supposed to be used to denote message language.
+
+
+Entailments
+-----------
+
+Samizdat RDF storage only implements simple entailment; vocabulary
+entailment is not implemented yet. At the moment, simple entailment
+suffices for all features of the Samizdat engine. If and when vocabulary
+entailment becomes necessary, it will be implemented in Samizdat RDF
+storage module or relegated to an alternative RDF storage backend,
+depending on status of backend alternatives for Samizdat at that time.
+
+
+Query Support
+-------------
+
+Samizdat RDF storage implements a translation of RDF query graphs
+written in extended Squish into relational SQL queries and allows purely
+relational representation of selected properties of site resources (see
+RDF Storage and Storage Implementation documents).
+
+It must be noted that, at the moment, the status of RDF query language
+standards is unsatisfactory.
+
+The DAML Query Language abstract specification provides an excellent formal
+basis, but does not encompass all capabilities of existing RDF query
+languages. Also, existing query languages are limited in one way or
+another, are underformalized (most are defined by a single
+implementation), and often overloaded with baroque syntax.
+
+Two major features that were missed the most in existing query languages
+at the time of Samizdat RDF storage implementation were: knowledge base
+update that allows merging complex constructs into the site KB graph
+(implemented in Samizdat RDF Data Manipulation Language), and workflow
+control providing at least transaction rollback (in Samizdat, underlying
+PostgreSQL transactions are used). Other Squish extensions implemented
+in Samizdat are literal conditions and answer collection ordering
+(currently, relegated to PostgreSQL; ideally, interpreted according to
+literal datatypes).
+
data/graffiti.gemspec
ADDED
@@ -0,0 +1,21 @@
+Gem::Specification.new do |spec|
+  spec.name = 'graffiti'
+  spec.version = '2.1'
+  spec.author = 'Dmitry Borodaenko'
+  spec.email = 'angdraug@debian.org'
+  spec.homepage = 'https://github.com/angdraug/graffiti'
+  spec.summary = 'Relational RDF store for Ruby'
+  spec.description = <<-EOF
+Graffiti is an RDF store based on dynamic translation of RDF queries into SQL.
+Graffiti allows one to map any relational database schema into RDF semantics
+and vice versa, to store any RDF data in a relational database.
+
+Graffiti uses Sequel to connect to database backend and provides a DBI-like
+interface to run RDF queries in Squish query language from Ruby applications.
+  EOF
+  spec.files = `git ls-files`.split "\n"
+  spec.test_files = Dir['test/ts_*.rb']
+  spec.license = 'GPL3+'
+  spec.add_dependency('syncache')
+  spec.add_dependency('sequel')
+end
data/lib/graffiti.rb
ADDED
@@ -0,0 +1,15 @@
+# Graffiti RDF Store
+# (originally written for Samizdat project)
+#
+#   Copyright (c) 2002-2009  Dmitry Borodaenko <angdraug@debian.org>
+#
+#   This program is free software.
+#   You can distribute/modify this program under the terms of
+#   the GNU General Public License version 3 or later.
+#
+# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
+# see doc/storage-impl.txt for explanation of implemented algorithms
+#
+# vim: et sw=2 sts=2 ts=8 tw=0
+
+require 'graffiti/store'
data/lib/graffiti/debug.rb
ADDED
@@ -0,0 +1,34 @@
+# Graffiti RDF Store
+# (originally written for Samizdat project)
+#
+#   Copyright (c) 2002-2011  Dmitry Borodaenko <angdraug@debian.org>
+#
+#   This program is free software.
+#   You can distribute/modify this program under the terms of
+#   the GNU General Public License version 3 or later.
+#
+# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
+# see doc/storage-impl.txt for explanation of implemented algorithms
+#
+# vim: et sw=2 sts=2 ts=8 tw=0
+
+module Graffiti
+
+  module Debug
+    private
+
+    DEBUG = false
+
+    def debug(message = nil)
+      return unless DEBUG
+
+      log message if message
+      log yield if block_given?
+    end
+
+    def log(message)
+      STDERR << 'Graffiti: ' << message.to_s << "\n"
+    end
+  end
+
+end
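The `debug` helper above accepts either a ready-made message or a block. The block form matters when the message is costly to build, because the block is never called while debugging is off. The following is a minimal sketch of that idea, not Graffiti's own code: the `LazyDebug` module, the `Worker` class, and the per-object `debug_enabled?` switch are all hypothetical names introduced for illustration.

```ruby
# Hypothetical stand-in for a Debug-style mixin (names are illustrative).
module LazyDebug
  # Assumption: a per-object switch instead of a module-level constant.
  def debug_enabled?
    @debug_enabled ? true : false
  end

  def debug(message = nil)
    return unless debug_enabled?

    log(message) if message
    log(yield) if block_given?   # the block runs only when debugging is on
  end

  def log(message)
    $stderr << 'LazyDebug: ' << message.to_s << "\n"
  end
end

class Worker
  include LazyDebug

  attr_reader :dumps

  def initialize(debug_enabled)
    @debug_enabled = debug_enabled
    @dumps = 0
  end

  # Stands in for an expensive inspection of internal state.
  def dump_state
    @dumps += 1
    'state dump'
  end

  def run
    debug { dump_state }
    self
  end
end

quiet = Worker.new(false).run
verbose = Worker.new(true).run
```

In this sketch `quiet.dumps` stays at zero because the block is skipped, while `verbose` evaluates it once and writes the result to standard error.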
data/lib/graffiti/exceptions.rb
ADDED
@@ -0,0 +1,20 @@
+# Graffiti RDF Store
+# (originally written for Samizdat project)
+#
+#   Copyright (c) 2002-2009  Dmitry Borodaenko <angdraug@debian.org>
+#
+#   This program is free software.
+#   You can distribute/modify this program under the terms of
+#   the GNU General Public License version 3 or later.
+#
+# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
+# see doc/storage-impl.txt for explanation of implemented algorithms
+#
+# vim: et sw=2 sts=2 ts=8 tw=0
+
+module Graffiti
+
+  # raised for syntax errors in Squish statements
+  class ProgrammingError < RuntimeError; end
+
+end
data/lib/graffiti/rdf_config.rb
ADDED
@@ -0,0 +1,78 @@
+# Graffiti RDF Store
+# (originally written for Samizdat project)
+#
+#   Copyright (c) 2002-2011  Dmitry Borodaenko <angdraug@debian.org>
+#
+#   This program is free software.
+#   You can distribute/modify this program under the terms of
+#   the GNU General Public License version 3 or later.
+#
+# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
+# see doc/storage-impl.txt for explanation of implemented algorithms
+#
+# vim: et sw=2 sts=2 ts=8 tw=0
+
+require 'graffiti/rdf_property_map'
+
+module Graffiti
+
+  # Configuration of relational RDF storage (see examples)
+  #
+  class RdfConfig
+    def initialize(config)
+      @ns = config['ns']
+
+      @map = {}
+
+      config['map'].each_pair do |p, m|
+        table, field = m.to_a.first
+        p = ns_expand(p)
+        @map[p] = RdfPropertyMap.new(p, table, field)
+      end
+
+      if config['subproperties'].kind_of? Hash
+        config['subproperties'].each_pair do |p, subproperties|
+          p = ns_expand(p)
+          map = @map[p] or raise RuntimeError,
+            "Incorrect RDF storage configuration: superproperty #{p} must be mapped"
+          map.superproperty = true
+
+          qualifier = RdfPropertyMap.qualifier_property(p)
+          @map[qualifier] = RdfPropertyMap.new(
+            qualifier, map.table, RdfPropertyMap.qualifier_field(map.field))
+
+          subproperties.each do |subp|
+            subp = ns_expand(subp)
+            @map[subp] = RdfPropertyMap.new(subp, map.table, map.field)
+            @map[subp].subproperty_of = p
+          end
+        end
+      end
+
+      if config['transitive_closure'].kind_of? Hash
+        config['transitive_closure'].each_pair do |p, table|
+          @map[ ns_expand(p) ].transitive_closure = table
+
+          if config['subproperties'].kind_of?(Hash) and config['subproperties'][p]
+            config['subproperties'][p].each do |subp|
+              @map[ ns_expand(subp) ].transitive_closure = table
+            end
+          end
+        end
+      end
+    end
+
+    # hash of namespaces
+    attr_reader :ns
+
+    # map internal property names with expanded namespaces to RdfPropertyMap
+    # objects
+    #
+    attr_reader :map
+
+    def ns_expand(p)
+      p and p.sub(/\A(\S+?)::/) { @ns[$1] }
+    end
+  end
+
+end
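`ns_expand` above replaces a leading `prefix::` with the URI stored for that prefix in the namespace hash. A standalone sketch of the same substitution follows; the two-argument signature and the sample namespace table are assumptions made so the snippet runs on its own, they are not the `RdfConfig` API.

```ruby
# Standalone version of the prefix expansion shown above.
# +ns+ maps namespace prefixes to URI stems (hypothetical sample data below).
def ns_expand(ns, name)
  # The regexp only matches a run of non-space characters followed by '::'
  # at the very start of the name, so full URIs pass through unchanged.
  name and name.sub(/\A(\S+?)::/) { ns[$1] }
end

ns = { 'dc' => 'http://purl.org/dc/elements/1.1/' }

expanded  = ns_expand(ns, 'dc::title')
untouched = ns_expand(ns, 'http://example.org/prop')
missing   = ns_expand(ns, nil)
```

Because the block result is converted with `to_s`, a prefix that is absent from the hash is replaced by an empty string rather than raising; a caller that needs stricter behaviour could use `ns.fetch($1)` instead.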
data/lib/graffiti/rdf_property_map.rb
ADDED
@@ -0,0 +1,92 @@
+# Graffiti RDF Store
+# (originally written for Samizdat project)
+#
+#   Copyright (c) 2002-2011  Dmitry Borodaenko <angdraug@debian.org>
+#
+#   This program is free software.
+#   You can distribute/modify this program under the terms of
+#   the GNU General Public License version 3 or later.
+#
+# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
+# see doc/storage-impl.txt for explanation of implemented algorithms
+#
+# vim: et sw=2 sts=2 ts=8 tw=0
+
+module Graffiti
+
+  # Map of an internal RDF property into relational storage
+  #
+  class RdfPropertyMap
+
+    # special qualifier map
+    #
+    # ' ' is added to the property name to make sure it can't clash with any
+    # valid property uriref
+    #
+    def RdfPropertyMap.qualifier_property(property, type = 'subproperty')
+      property + ' ' + type
+    end
+
+    # special qualifier field
+    #
+    def RdfPropertyMap.qualifier_field(field, type = 'subproperty')
+      field + '_' + type
+    end
+
+    def initialize(property, table, field)
+      # fixme: support ambiguous mappings
+      @property = property
+      @table = table
+      @field = field
+    end
+
+    # expanded uriref of the mapped property
+    #
+    attr_reader :property
+
+    # name of the table into which the property is mapped (property domain is an
+    # internal resource class mapped into this table)
+    #
+    attr_reader :table
+
+    # name of the field into which the property is mapped
+    #
+    # if property range is not a literal, the field is a reference to the
+    # resource table
+    #
+    attr_reader :field
+
+    # expanded uriref of the property which this property is a subproperty of
+    #
+    # if set, this property maps into the same table and field as its
+    # superproperty, and is qualified by an additional field named
+    # <field>_subproperty which refers to a uriref resource holding uriref of
+    # this subproperty
+    #
+    attr_accessor :subproperty_of
+
+    attr_writer :superproperty
+
+    # set to +true+ if this property has subproperties
+    #
+    def superproperty?
+      @superproperty or false
+    end
+
+    # name of transitive closure table for a transitive property
+    #
+    # the format of a transitive closure table is:
+    #
+    # - 'resource' field refers to the subject resource id
+    # - '<field>' property field and '<field>_subproperty' qualifier field (in
+    #   case of subproperty) have the same name as in the main table
+    # - 'distance' field holds the distance from subject to object in the RDF
+    #   graph
+    #
+    # the transitive closure table is automatically updated by a trigger on every
+    # update of the main table
+    #
+    attr_accessor :transitive_closure
+  end
+
+end
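The comment on `transitive_closure` describes rows made of a subject resource, an object in the property field, and a distance. As a rough in-memory illustration of what such rows contain, the sketch below derives them from direct edges with a breadth-first search. It is not the database trigger: `closure_rows` is a hypothetical helper, and it assumes that distance counts edges and that a node is not listed as related to itself; the real trigger may differ on both points.

```ruby
# Build closure rows [resource, object, distance] from direct [from, to] edges.
# Assumptions (not taken from Graffiti): distance is the number of edges on
# the shortest path, and zero-distance self rows are omitted.
def closure_rows(edges)
  successors = Hash.new { |hash, key| hash[key] = [] }
  edges.each { |from, to| successors[from] << to }

  rows = []
  successors.keys.each do |start|
    distance = { start => 0 }
    queue = [start]

    until queue.empty?
      node = queue.shift
      next unless successors.key?(node)   # leaf node, nothing to follow

      successors[node].each do |nxt|
        next if distance.key?(nxt)        # already reached by a shorter path

        distance[nxt] = distance[node] + 1
        rows << [start, nxt, distance[nxt]]
        queue << nxt
      end
    end
  end
  rows
end

chain = closure_rows([[1, 2], [2, 3]]).sort
cycle = closure_rows([[1, 2], [2, 1]]).sort
```

For the chain 1 → 2 → 3 this yields one row per reachable pair, including the indirect pair (1, 3) at distance 2, which is the information a transitive query needs without walking the main table repeatedly.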
data/lib/graffiti/sql_mapper.rb
ADDED
@@ -0,0 +1,916 @@
+# Graffiti RDF Store
+# (originally written for Samizdat project)
+#
+#   Copyright (c) 2002-2011  Dmitry Borodaenko <angdraug@debian.org>
+#
+#   This program is free software.
+#   You can distribute/modify this program under the terms of
+#   the GNU General Public License version 3 or later.
+#
+# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
+# see doc/storage-impl.txt for explanation of implemented algorithms
+#
+# vim: et sw=2 sts=2 ts=8 tw=0
+
+require 'delegate'
+require 'uri/common'
+require 'graffiti/rdf_property_map'
+require 'graffiti/squish'
+
+module Graffiti
+
+  class SqlNodeBinding
+    def initialize(table_alias, field)
+      @alias = table_alias
+      @field = field
+    end
+
+    attr_reader :alias, :field
+
+    def to_s
+      @alias + '.' + @field
+    end
+
+    alias :inspect :to_s
+
+    def eql?(binding)
+      @alias == binding.alias and @field == binding.field
+    end
+
+    alias :'==' :eql?
+
+    def hash
+      self.to_s.hash
+    end
+  end
+
+
+  class SqlExpression < DelegateClass(Array)
+    def initialize(*parts)
+      super parts
+    end
+
+    def to_s
+      '(' << self.join(' ') << ')'
+    end
+
+    alias :to_str :to_s
+
+    def traverse(&block)
+      self.each do |part|
+        case part
+        when SqlExpression
+          part.traverse(&block)
+        else
+          yield part
+        end
+      end
+    end
+
+    def rebind!(rebind, &block)
+      self.each_with_index do |part, i|
+        case part
+        when SqlExpression
+          part.rebind!(rebind, &block)
+        when SqlNodeBinding
+          if rebind[part]
+            self[i] = rebind[part]
+            yield part if block_given?
+          end
+        end
+      end
+    end
+
+    alias :eql? :'=='
+
+    def hash
+      self.to_s.hash
+    end
+  end
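`SqlExpression` wraps an Array through `DelegateClass` and renders itself as a parenthesised, space-joined string, so nested expressions produce nested parentheses. The sketch below reproduces that behaviour in isolation. `Expr` is a hypothetical miniature, not the class above, and the SQL fragments are made-up sample strings.

```ruby
require 'delegate'

# Hypothetical miniature of an Array-backed expression node.
class Expr < DelegateClass(Array)
  def initialize(*parts)
    super(parts)          # the Array of parts becomes the delegate target
  end

  def to_s
    '(' + join(' ') + ')'
  end

  # Array#join asks each element for to_str before trying to_ary, so this
  # alias makes a nested Expr render through to_s, parentheses included.
  alias to_str to_s
end

inner = Expr.new('a.id', '=', 'b.parent')
outer = Expr.new(inner, 'AND', 'b.rating > 0')
sql = outer.to_s
```

Since every Array method is forwarded, callers can still use `each`, `push`, or index assignment on an expression, which is what a method like `rebind!` relies on when it swaps parts in place.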
+
+
+  # Transform RDF query pattern graph into a relational join expression.
+  #
+  class SqlMapper
+    include Debug
+
+    def initialize(config, pattern, negative = [], optional = [], global_filter = '')
+      @config = config
+      @global_filter = global_filter
+
+      check_graph(pattern)
+      negative.empty? or check_graph(pattern + negative)
+      optional.empty? or check_graph(pattern + optional)
+
+      map_predicates(pattern, negative, optional)
+      transform
+      generate_tables_and_conditions
+
+      @jc = @aliases = @ac = @global_filter = nil
+    end
+
+    # map clause position to table, field, and table alias
+    #
+    #   position => {
+    #     :subject => {
+    #       :node => node,
+    #       :field => field
+    #     },
+    #     :object => {
+    #       :node => node,
+    #       :field => field
+    #     },
+    #     :map => RdfPropertyMap,
+    #     :bind_mode => < :must_bind | :may_bind | :must_not_bind >,
+    #     :alias => alias
+    #   }
+    #
+    attr_reader :clauses
+
+    # map node to list of positions in clauses
+    #
+    #   node => {
+    #     :positions => [
+    #       { :clause => position, :role => < :subject | :object > }
+    #     ],
+    #     :bind_mode => < :must_bind | :may_bind | :must_not_bind >,
+    #     :colors => { color1 => bind_mode1, ... },
+    #     :ground => < true | false >
+    #   }
+    #
+    attr_reader :nodes
+
+    # list of tables for FROM clause of SQL query
+    attr_reader :from
+
+    # conditions for WHERE clause of SQL query
+    attr_reader :where
+
+    # return node's binding, raise exception if the node isn't bound
+    #
+    def bind(node)
+      (@nodes[node] and @bindings[node] and (binding = @bindings[node].first)
+      ) or raise ProgrammingError,
+        "Node '#{node}' is not bound by the query pattern"
+
+      @nodes[node][:positions].each do |p|
+        if :object == p[:role] and @clauses[ p[:clause] ][:map].subproperty_of
+
+          property = @clauses[ p[:clause] ][:map].property
+          return %{select_subproperty(#{binding}, #{bind(property)})}
+        end
+      end
+
+      binding
+    end
+
+    private
+
+    # Check whether pattern is not a disjoint graph (all nodes are
+    # undirectionally reachable from one node).
+    #
+    def check_graph(pattern)
+      nodes = pattern.transpose[1, 2].flatten.uniq   # all nodes
+
+      seen = [ nodes.shift ]
+      found_more = true
+
+      while found_more and not nodes.empty?
+        found_more = false
+
+        pattern.each do |predicate, subject, object|
+
+          if seen.include?(subject) and nodes.include?(object)
+            seen.push(object)
+            nodes.delete(object)
+            found_more = true
+
+          elsif seen.include?(object) and nodes.include?(subject)
+            seen.push(subject)
+            nodes.delete(subject)
+            found_more = true
+          end
+        end
+      end
+
+      nodes.empty? or raise ProgrammingError, "Query pattern is a disjoint graph"
+    end
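`check_graph` treats each `[predicate, subject, object]` clause as an undirected edge and keeps sweeping the pattern until no new node can be reached from the ones already seen. A standalone sketch of that check follows. `connected_pattern?` is a hypothetical name, it returns a boolean instead of raising, and it collects nodes with `flat_map` rather than `transpose`; the sample clauses are invented.

```ruby
# Returns true when every node of +pattern+ can be reached from the first
# node, treating each [predicate, subject, object] clause as an undirected
# edge. Hypothetical helper that mirrors the sweep used by check_graph.
def connected_pattern?(pattern)
  nodes = pattern.flat_map { |_predicate, subject, object| [subject, object] }.uniq
  seen = [nodes.shift]
  found_more = true

  while found_more && !nodes.empty?
    found_more = false

    pattern.each do |_predicate, subject, object|
      if seen.include?(subject) && nodes.include?(object)
        seen << object
        nodes.delete(object)
        found_more = true
      elsif seen.include?(object) && nodes.include?(subject)
        seen << subject
        nodes.delete(subject)
        found_more = true
      end
    end
  end

  nodes.empty?
end

joined   = connected_pattern?([['p', '?a', '?b'], ['q', '?c', '?b']])
disjoint = connected_pattern?([['p', '?a', '?b'], ['q', '?c', '?d']])
```

The first pattern is connected because `?b` appears in both clauses, even though it is the object in each; the second has no shared node, which is the case the original method rejects with a `ProgrammingError`.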
+
+    # Stage 1: Predicate Mapping (storage-impl.txt).
+    #
+    def map_predicates(pattern, negative, optional)
+      @nodes = {}
+      @clauses = []
+
+      map_pattern(pattern, :must_bind)
+      map_pattern(negative, :must_not_bind)
+      map_pattern(optional, :may_bind)
+
+      @color_counter = @must_bind_nodes = nil
+
+      refine_ambiguous_properties
+
+      debug do
+        @nodes.each do |node, n|
+          debug %{#{node}: #{n[:bind_mode]} #{n[:colors].inspect}}
+        end
+      end
+    end
+
+    # Label every connected component of the pattern with a different color.
+    #
+    # Pattern clause positions:
+    #
+    # 0. predicate
+    # 1. subject
+    # 2. object
+    # 3. filter
+    #
+    # Returns hash of node colors.
+    #
+    # Implements the {Two-pass Connected Component Labeling algorithm}
+    # [http://en.wikipedia.org/wiki/Connected_Component_Labeling#Two-pass]
+    # with an added special case to exclude _alien_nodes_ from neighbor lists.
+    #
+    # The special case ensures that parts of a may-bind or must-not-bind
+    # subpattern that are only connected through a must-bind node do not connect.
+    #
+    def label_pattern_components(pattern, alien_nodes, augment_alien_nodes = false)
+      return {} if pattern.empty?
+
+      color = {}
+      color_eq = []   # [ [ smaller, larger ], ... ]
+      nodes = pattern.transpose[1, 2].flatten.uniq
+      alien_nodes_here = nodes & alien_nodes
+
+      @color_counter = @color_counter ? @color_counter.next : 0
+      color[ nodes[0] ] = @color_counter
+
+      # first pass
+      1.upto(nodes.size - 1) do |i|
+        node = nodes[i]
+
+        pattern.each do |predicate, subject, object, filter|
+          if node == subject
+            neighbor = object
+          elsif node == object
+            neighbor = subject
+          end
+          next if neighbor.nil? or color[neighbor].nil? or
+            alien_nodes_here.include?(neighbor)
+
+          if color[node].nil?
+            color[node] = color[neighbor]
+          elsif color[node] != color[neighbor]   # record color equivalence
+            color_eq |= [ [ color[node], color[neighbor] ].sort ]
+          end
+        end
+
+        color[node] ||= (@color_counter += 1)
+      end
+
+      # second pass
+      nodes.each do |node|
+        while eq = color_eq.rassoc(color[node])
+          color[node] = eq[0]
+        end
+      end
+
+      alien_nodes.push(*nodes).uniq! if augment_alien_nodes
+
+      color
+    end
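The two-pass labeling idea can be shown on its own: a first pass hands out provisional colors and records which colors turned out to be equivalent, and a second pass rewrites every color to its representative. The sketch below is a simplified variant, not the method above. It deliberately omits the alien-node special case, keeps its own local counter, and resolves equivalences through a small parent table; `label_components` and the sample patterns are hypothetical.

```ruby
# Simplified two-pass connected component labeling over
# [predicate, subject, object] clauses. Hypothetical sketch: no alien-node
# handling, and equivalences are resolved through a parent table.
def label_components(pattern)
  nodes = pattern.flat_map { |_predicate, subject, object| [subject, object] }.uniq
  parent = {}   # larger color => smaller equivalent color
  find = lambda do |c|
    c = parent[c] while parent.key?(c)
    c
  end

  color = {}
  next_color = 0

  # first pass: provisional colors plus recorded equivalences
  nodes.each do |node|
    pattern.each do |_predicate, subject, object|
      neighbor = if node == subject then object
                 elsif node == object then subject
                 end
      next if neighbor.nil? || color[neighbor].nil?

      if color[node].nil?
        color[node] = color[neighbor]
      else
        small, large = [find.call(color[node]), find.call(color[neighbor])].sort
        parent[large] = small if small != large
      end
    end

    if color[node].nil?
      color[node] = next_color
      next_color += 1
    end
  end

  # second pass: replace every color by its representative
  nodes.each { |node| color[node] = find.call(color[node]) }
  color
end

merged = label_components([['p', 'a', 'b'], ['p', 'c', 'd'], ['p', 'b', 'd']])
split  = label_components([['p', 'a', 'b'], ['p', 'c', 'd']])
```

In the first pattern `c` initially receives its own color, and the clause linking `b` to `d` later records that the two colors are equivalent, so the second pass collapses them into one.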
+
+    def map_pattern(pattern, bind_mode = :must_bind)
+      pattern = pattern.dup
+      @must_bind_nodes ||= []
+      color = label_pattern_components(pattern, @must_bind_nodes, :must_bind == bind_mode)
+
+      pattern.each do |predicate, subject, object, filter, transitive|
+
+        # validate the triple
+        predicate =~ URI::URI_REF or raise ProgrammingError,
+          "Valid uriref expected in predicate position instead of '#{predicate}'"
+
+        [subject, object].each do |node|
+          node =~ SquishQuery::INTERNAL or
+            node =~ SquishQuery::BN or
+            node =~ URI::URI_REF or
+            raise ProgrammingError,
+              "Resource or blank node name expected instead of '#{node}'"
+        end
+
+        # list of possible mappings into internal tables
+        map = @config.map[predicate]
+
+        if transitive and map.transitive_closure.nil?
+          raise ProgrammingError,
+            "No transitive closure is defined for #{predicate} property"
+        end
+
+        if map and
+          (subject =~ SquishQuery::BN or
+           subject =~ SquishQuery::INTERNAL or
+           subject =~ SquishQuery::PARAMETER or
+           'resource' == map.table)
+          # internal predicate and subject is mappable to resource table
+
+          i = clauses.size
+
+          @clauses[i] = {
+            :subject => [ { :node => subject, :field => 'id' } ],
+            :object => [ { :node => object, :field => map.field } ],
+            :map => map,
+            :transitive => transitive,
+            :bind_mode => bind_mode
+          }
+          @clauses[i][:filter] = SqlExpression.new(filter) if filter
+
+          [subject, object].each do |node|
+            if @nodes[node]
+              @nodes[node][:bind_mode] =
+                stronger_bind_mode(@nodes[node][:bind_mode], bind_mode)
+            else
+              @nodes[node] = { :positions => [], :bind_mode => bind_mode, :colors => {} }
+            end
+
+            # set of node colors, one for each bind_mode
+            @nodes[node][:colors][ color[node] ] = bind_mode
+          end
+
+          # reverse mapping of the node occurrences
+          @nodes[subject][:positions].push( { :clause => i, :role => :subject } )
+          @nodes[object][:positions].push( { :clause => i, :role => :object } )
+
+          if superp = map.subproperty_of
+            # link subproperty qualifier into the pattern
+            pattern.push(
+              [RdfPropertyMap.qualifier_property(superp), subject, predicate])
+            color[predicate] = color[object]
+
+            # no need to ground both subproperty and superproperty
+            @nodes[object][:ground] = true
+          end
+
+        else
+          # assume reification for unmapped predicates:
+          #
+          #              | (rdf::predicate ?_stmt_#{i} p)
+          #   (p s o) -> | (rdf::subject ?_stmt_#{i} s)
+          #              | (rdf::object ?_stmt_#{i} o)
+          #
+          rdf = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
+          stmt = "?_stmt_#{i}"
+          pattern.push([rdf + 'predicate', stmt, predicate],
+                       [rdf + 'subject', stmt, subject],
+                       [rdf + 'object', stmt, object])
+          color[stmt] = color[predicate] = color[object]
+        end
+      end
+    end
+
+    # Select strongest of the two bind modes, in the following order of
+    # preference:
+    #
+    # :must_bind -> :must_not_bind -> :may_bind
+    #
+    def stronger_bind_mode(mode1, mode2)
+      if mode1 != mode2 and (:must_bind == mode2 or :may_bind == mode1)
+        mode2
+      else
+        mode1
+      end
+    end
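The preference order `:must_bind -> :must_not_bind -> :may_bind` can also be written as an explicit ranking, which some readers find easier to check than the boolean condition. The sketch below is an alternative formulation under that assumption, not the method above: `BIND_MODE_RANK` and `stronger_by_rank` are hypothetical names, and `stronger_original` restates the condition from `stronger_bind_mode` only so the two can be compared.

```ruby
# Hypothetical ranking table: a higher number means a stronger bind mode.
BIND_MODE_RANK = { :must_bind => 2, :must_not_bind => 1, :may_bind => 0 }.freeze

# Picks the stronger of two bind modes; keeps mode1 on a tie.
def stronger_by_rank(mode1, mode2)
  BIND_MODE_RANK.fetch(mode2) > BIND_MODE_RANK.fetch(mode1) ? mode2 : mode1
end

# The condition used by stronger_bind_mode, restated for comparison.
def stronger_original(mode1, mode2)
  if mode1 != mode2 and (:must_bind == mode2 or :may_bind == mode1)
    mode2
  else
    mode1
  end
end

modes = BIND_MODE_RANK.keys
mismatches = modes.product(modes).reject do |mode1, mode2|
  stronger_by_rank(mode1, mode2) == stronger_original(mode1, mode2)
end
```

Comparing all nine combinations is cheap, and `fetch` makes an unknown mode fail loudly with a `KeyError` instead of silently winning or losing.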
+
+    # If a node can be mapped to more than one [table, field] pair, see if it can
+    # be refined based on other occurrences of this node in other query clauses.
+    #
+    def refine_ambiguous_properties
+      @nodes.each_value do |n|
+        map = n[:positions]
+
+        map.each_with_index do |p, i|
+          big = @clauses[ p[:clause] ][ p[:role] ]
+          next if big.size <= 1   # no refining needed
+
+          debug { n.inspect + ': ' + big.inspect }
+
+          (i + 1).upto(map.size - 1) do |j|
+            small_p = map[j]
+            small = @clauses[ small_p[:clause] ][ small_p[:role] ]
+
+            refined = big & small
+            if refined.size > 0 and refined.size < big.size
+
+              # refine the node...
+              @clauses[ p[:clause] ][ p[:role] ] = big = refined
+
+              # ...and its pair
+              @clauses[ p[:clause] ][ opposite_role(p[:role]) ].collect! {|pair|
+                refined.assoc(pair[0]) ? pair : nil
+              }.compact!
+            end
+          end
+        end
+      end
+
+      # drop remaining ambiguous mappings
+      # todo: split query for ambiguous mappings
+      @clauses.each do |clause|
+        next if clause.nil?   # means it was reified
+        clause[:subject] = clause[:subject].first
+        clause[:object] = clause[:object].first
+      end
+    end
+
+    def opposite_role(role)
+      :subject == role ? :object : :subject
+    end
+
+    # Return current value of alias counter, remember which table it was assigned
+    # to, and increment the counter.
+    #
+    def next_alias(table, node, bind_mode = @nodes[node][:bind_mode])
+      @ac ||= 'a'
+      @aliases ||= {}
+
+      a = @ac.dup
+      @aliases[a] = {
+        :table => table,
+        :node => node,
+        :bind_mode => bind_mode,
+        :filter => []
+      }
+
+      @ac.next!
+      return a
+    end
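`next_alias` hands out table aliases by keeping a string counter and advancing it with `String#next!`, which steps through `a`, `b`, ... `z` and then rolls over to `aa`. The sketch below isolates that counter. `AliasCounter` is a hypothetical class, and it leaves out the bookkeeping of table, node, and bind mode that the real method records for each alias.

```ruby
# Hypothetical counter that yields 'a', 'b', ..., 'z', 'aa', 'ab', ...
class AliasCounter
  def initialize
    @current = 'a'.dup   # dup so the counter is a mutable string
  end

  def next_alias
    issued = @current.dup   # copy first: next! mutates the counter in place
    @current.next!
    issued
  end
end

counter = AliasCounter.new
first_three = Array.new(3) { counter.next_alias }

rollover = AliasCounter.new
26.times { rollover.next_alias }
twenty_seventh = rollover.next_alias
```

Returning a copy matters because the issued alias is later used as a hash key and embedded in SQL; handing out the counter object itself would let a later `next!` change an alias that was already issued.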
+
+  def define_relation_aliases
+    @nodes.each do |node, n|
+
+      positions = n[:positions]
+
+      # go through all clauses with this node in subject position
+      positions.each_with_index do |p, i|
+        next if :subject != p[:role] or @clauses[ p[:clause] ][:alias]
+
+        clause = @clauses[ p[:clause] ]
+        map = clause[:map]
+        table = clause[:transitive] ? map.transitive_closure : map.table
+
+        # see if we've already mapped this node to the same table before
+        0.upto(i - 1) do |j|
+          similar_clause = @clauses[ positions[j][:clause] ]
+
+          if similar_clause[:alias] and
+            similar_clause[:map].table == table and
+            similar_clause[:map].field != map.field
+            # same node, same table, different field -> same alias
+
+            clause[:alias] = similar_clause[:alias]
+            break
+          end
+        end
+
+        if clause[:alias].nil?
+          clause[:alias] =
+            if clause[:transitive]
+              # transitive clause bind mode overrides a stronger node bind mode
+              #
+              # fixme: generic case for multiple aliases per node
+              next_alias(table, node, clause[:bind_mode])
+            else
+              next_alias(table, node)
+            end
+        end
+      end
+    end # optimize: unnecessary aliases are generated
+  end
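The reuse rule in this method ("same node, same table, different field -> same alias") can be tried in isolation. The sketch below is a simplification under stated assumptions: the `Clause` struct, `assign_aliases`, and the table and field names are hypothetical, and transitive clauses and bind modes are left out.

```ruby
# Hypothetical clause record for one subject node: table, field, assigned alias.
Clause = Struct.new(:table, :field, :alias_name)

# Assign aliases to the clauses of a single node, sharing an alias between
# clauses that read different fields of the same table.
def assign_aliases(clauses)
  counter = 'a'.dup
  clauses.each_with_index do |clause, i|
    similar = clauses[0, i].find do |c|
      c.alias_name && c.table == clause.table && c.field != clause.field
    end

    if similar
      clause.alias_name = similar.alias_name
    else
      clause.alias_name = counter.dup
      counter.next!
    end
  end
end

shared = assign_aliases([
  Clause.new('message', 'title'),
  Clause.new('message', 'content'),
  Clause.new('statement', 'subject')
]).map(&:alias_name)
# => ["a", "a", "b"]

separate = assign_aliases([
  Clause.new('message', 'parent'),
  Clause.new('message', 'parent')
]).map(&:alias_name)
# => ["a", "b"]
```

Two clauses on the same field get separate aliases because one row cannot hold two different values in a single column, so a second copy of the table has to be joined.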
+
+  def update_alias_filters
+    @clauses.each do |c|
+      if c[:filter]
+        @aliases[ c[:alias] ][:filter].push(c[:filter])
+      end
+    end
+  end
+
+  # Stage 2: Relation Aliases and Join Conditions (storage-impl.txt).
+  #
+  # Result is a map of aliases in @aliases and a list of join conditions in @jc.
+  #
+  def transform
+    define_relation_aliases
+    update_alias_filters
+
+    # [ [ binding1, binding2, node ], ... ]
+    @jc = []
+    @bindings = {}
+
+    @nodes.each do |node, n|
+      positions = n[:positions]
+
+      # node binding
+      first = positions.first
+      clause = @clauses[ first[:clause] ]
+      a = clause[:alias]
+      binding = SqlNodeBinding.new(a, clause[ first[:role] ][:field])
+      @bindings[node] = [ binding ]
+
+      # join conditions
+      1.upto(positions.size - 1) do |i|
+        p = positions[i]
+        clause2 = @clauses[ p[:clause] ]
+        binding2 = SqlNodeBinding.new(clause2[:alias], clause2[ p[:role] ][:field])
+
+        unless @bindings[node].include?(binding2)
+          @bindings[node].push(binding2)
+          @jc.push([binding, binding2, node])
+          n[:ground] = true
+        end
+      end
+
+      # ground non-blank nodes
+      if node !~ SquishQuery::BN
+
+        if node =~ SquishQuery::INTERNAL # internal resource id
+          @aliases[a][:filter].push SqlExpression.new(binding, '=', $1)
+
+        elsif node =~ SquishQuery::PARAMETER or node =~ SquishQuery::LITERAL
+          @aliases[a][:filter].push SqlExpression.new(binding, '=', node)
+
+        elsif node =~ URI::URI_REF # external resource uriref
+
+          r = nil
+          positions.each do |p|
+            next unless :subject == p[:role]
+
+            c = @clauses[ p[:clause] ]
+            if 'resource' == c[:map].table
+              r = c[:alias] # reuse existing mapping to resource table
+              break
+            end
+          end
+
+          if r.nil?
+            r = next_alias('resource', node)
+            r_binding = SqlNodeBinding.new(r, 'id')
+            @bindings[node].unshift(r_binding)
+            @jc.push([ binding, r_binding, node ])
+          end
+
+          @aliases[r][:filter].push SqlExpression.new(
+            SqlNodeBinding.new(r, 'uriref'), '=', "'t'", 'AND',
+            SqlNodeBinding.new(r, 'label'), '=', %{'#{node}'})
+
+        else
+          raise RuntimeError,
+            "Invalid node '#{node}' should never occur at this point"
+        end
+
+        n[:ground] = true
+      end
+    end
+
+    debug do
+      @aliases.each {|alias_name, a| debug %{#{alias_name}: #{a.inspect}} }
+      @jc.each {|jc| debug jc.inspect }
+    end
+  end
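The join-condition step of `transform` pairs the first binding of a node with every other distinct binding of the same node. A minimal sketch of that idea is below, assuming a binding is just an alias plus a field and prints as `alias.field`; `NodeBinding` and `join_conditions` are hypothetical stand-ins, not the real `SqlNodeBinding`.

```ruby
# Hypothetical binding type; Struct supplies ==, eql? and hash for free.
NodeBinding = Struct.new(:alias_name, :field) do
  def to_s
    "#{alias_name}.#{field}"
  end
end

# positions: every [alias, field] pair at which one node occurs.
# Returns one equality per additional distinct binding of that node.
def join_conditions(positions)
  first, *rest = positions.map { |a, f| NodeBinding.new(a, f) }
  rest.uniq.reject { |b| b == first }.map { |b| "#{first} = #{b}" }
end

conditions = join_conditions([
  ['a', 'id'], ['b', 'subject'], ['c', 'object'], ['b', 'subject']
])
# => ["a.id = b.subject", "a.id = c.object"]
```

Anchoring every condition on the first binding yields a star of equalities instead of a chain, which is enough to tie all occurrences of the node together.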
+
+  # Produce SQL FROM and WHERE clauses from results of transform().
+  #
+  def generate_tables_and_conditions
+    main_path, seen = jc_subgraph_path(:must_bind)
+    debug { main_path.inspect }
+
+    main_path and not main_path.empty? or raise RuntimeError,
+      'Failed to find table aliases for main query'
+
+    @where = ground_dangling_blank_nodes(main_path)
+
+    joins = ''
+    subquery_count = 'a'
+
+    [ :must_not_bind, :may_bind ].each do |bind_mode|
+      loop do
+        sub_path, new = jc_subgraph_path(bind_mode, seen)
+        break if sub_path.nil? or sub_path.empty?
+
+        debug { sub_path.inspect }
+
+        sub_query, sub_join = sub_path.partition {|a,| main_path.assoc(a).nil? }
+        # fixme: make sure that sub_join is not empty
+
+        if 1 == sub_query.size
+          # simplified case: join single table directly without a subquery
+          join_alias, = sub_query.first
+          a = @aliases[join_alias]
+          join_target = a[:table]
+          join_conditions = jc_path_to_join_conditions(sub_join) + a[:filter]
+
+        else
+          # left join subquery to the main query
+          join_alias = '_subquery_' << subquery_count
+          subquery_count.next!
+
+          sub_join = subquery_jc_path(sub_join, join_alias)
+          rebind = rebind_subquery(sub_path, join_alias)
+          select_nodes = subquery_select_nodes(rebind, main_path, sub_join)
+
+          join_conditions = jc_path_to_join_conditions(sub_join, rebind,
+                                                       select_nodes)
+
+          select_nodes = select_nodes.keys.collect {|b|
+            b.to_s << ' AS ' << rebind[b].field
+          }.join(', ')
+
+          tables, conditions = jc_path_to_tables_and_conditions(sub_path)
+
+          join_target = "(\nSELECT #{select_nodes}\nFROM #{tables}"
+          join_target << "\nWHERE " << conditions unless conditions.empty?
+          join_target << "\n)"
+          join_target.gsub!(/\n(?!\)\z)/, "\n ")
+        end
+
+        joins << ("\nLEFT JOIN " + join_target + ' AS ' + join_alias + ' ON ' +
+                  join_conditions.uniq.join(' AND '))
+
+        if :must_not_bind == bind_mode
+          left_join_is_null(main_path, sub_join)
+        end
+      end
+    end
+
+    @from, main_where = jc_path_to_tables_and_conditions(main_path)
+
+    @from << joins
+
+    @where.push('(' + main_where + ')') unless main_where.empty?
+    @where.push('(' + @global_filter + ')') unless @global_filter.empty?
+    @where = @where.join("\nAND ")
+  end
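The subquery branch builds a parenthesized `SELECT` and then indents its body with a regular expression whose negative lookahead skips the newline before the final closing parenthesis. The sketch below isolates that string handling; `wrap_subquery`, the four-space indent, and the sample table and column names are assumptions made for illustration.

```ruby
# Hypothetical helper mirroring how the subquery text is assembled above.
def wrap_subquery(select_list, tables, conditions)
  target = "(\nSELECT #{select_list}\nFROM #{tables}"
  target << "\nWHERE " << conditions unless conditions.empty?
  target << "\n)"
  # Indent every line except the one holding the closing parenthesis:
  # the lookahead rejects a newline followed by ")" at the end of the string.
  target.gsub(/\n(?!\)\z)/, "\n    ")
end

sql = wrap_subquery('c.id AS _field_a', 'statement AS c', 'c.predicate = 1')
puts sql
# (
#     SELECT c.id AS _field_a
#     FROM statement AS c
#     WHERE c.predicate = 1
# )
```

The result can then be appended after `LEFT JOIN` with an alias and an `ON` clause, as the method does for each subquery.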
+
+  # Produce a subgraph path through join conditions linking all aliases with
+  # given _bind_mode_ that form a same-color connected component of the join
+  # conditions graph and weren't processed yet:
+  #
+  #   path = [ [start, []], [ next, [ jc, ... ] ], ... ]
+  #
+  # Update _seen_ hash for all aliases included in the produced path.
+  #
+  def jc_subgraph_path(bind_mode, seen = {})
+    start = find_alias(bind_mode, seen)
+    return nil if start.nil?
+
+    new = {}
+    new[start] = true
+    path = [ [start, []] ]
+    colors = @nodes[ @aliases[start][:node] ][:colors].keys
+
+    loop do # while we can find more connecting joins of the same color
+      join_alias = nil
+
+      @jc.each do |jc|
+        # use cases:
+        # - seen is empty (composing the must-bind join)
+        # - seen is not empty (composing a subquery)
+
+        next if (colors & @nodes[ jc[2] ][:colors].keys).empty?
+
+        0.upto(1) do |i|
+          a_seen = jc[i].alias
+          a_next = jc[1-i].alias
+
+          if not new[a_next] and (
+            ((new[a_seen] or seen[a_seen]) and
+             (@aliases[a_next][:bind_mode] == bind_mode)
+             # connect an untouched node of matching bind mode
+            ) or (
+              new[a_seen] and seen[a_next] and
+              # connect subquery to the rest of the query...
+              @aliases[a_seen][:bind_mode] == bind_mode
+              # ...but only go one step deep
+            ))
+
+            join_alias = a_next
+            break
+          end
+        end
+
+        break if join_alias
+      end
+
+      break if join_alias.nil?
+
+      # join it to all seen aliases
+      join_on = @jc.find_all do |jc|
+        a1, a2 = jc[0, 2].collect {|b| b.alias }
+        (new[a1] and a2 == join_alias) or (new[a2] and a1 == join_alias)
+      end
+
+      new[join_alias] = true
+      path.push([join_alias, join_on])
+    end
+
+    seen.merge!(new)
+    [ path, new ]
+  end
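At its core, the path construction grows a connected component one alias at a time and records, for each newly added alias, the join conditions that link it to aliases already in the path. The sketch below shows only that core, as a simplification: it ignores colors, the previously seen aliases, and the one-step connection back to the main query. `component_path` and its plain-string edges are hypothetical.

```ruby
# edges: [alias1, alias2] pairs standing in for join conditions.
# modes: alias => bind mode.
def component_path(start, edges, modes, bind_mode)
  visited = { start => true }
  path = [[start, []]]

  loop do
    nxt = nil

    # find an unvisited alias of the right bind mode next to a visited one
    edges.each do |a, b|
      [[a, b], [b, a]].each do |seen, candidate|
        if visited[seen] && !visited[candidate] && modes[candidate] == bind_mode
          nxt = candidate
          break
        end
      end
      break if nxt
    end

    break if nxt.nil?

    # collect every edge that links the new alias to the visited set
    join_on = edges.select do |a, b|
      (visited[a] && b == nxt) || (visited[b] && a == nxt)
    end

    visited[nxt] = true
    path.push([nxt, join_on])
  end

  path
end

edges = [['a', 'b'], ['b', 'c'], ['c', 'd']]
modes = { 'a' => :must_bind, 'b' => :must_bind,
          'c' => :must_bind, 'd' => :may_bind }
path = component_path('a', edges, modes, :must_bind)
# => [["a", []], ["b", [["a", "b"]]], ["c", [["b", "c"]]]]
```

Alias `d` stays out of the path because its bind mode differs, which is how the optional parts of a query end up in their own later paths.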
+
+  def find_alias(bind_mode, seen = {})
+    @aliases.each do |alias_name, a|
+      next if seen[alias_name] or a[:bind_mode] != bind_mode
+      return alias_name
+    end
+
+    nil
+  end
+
+  # Ground all must-bind blank nodes that weren't grounded elsewhere to an
+  # existential quantifier.
+  #
+  def ground_dangling_blank_nodes(main_path)
+    conditions = []
+    ground_nodes = @global_filter.scan(SquishQuery::BN_SCAN)
+
+    @nodes.each do |node, n|
+      next if (n[:ground] or ground_nodes.include?(node))
+
+      expression =
+        case n[:bind_mode]
+        when :must_bind
+          'IS NOT NULL'
+        when :must_not_bind
+          'IS NULL'
+        else
+          next
+        end
+
+      @bindings[node].each do |binding|
+        if main_path.assoc(binding.alias)
+          conditions.push SqlExpression.new(binding, expression)
+          break
+        end
+      end
+    end
+
+    conditions
+  end
+
+  # Join a subquery to the main query: for each alias shared between the two,
+  # link 'id' field of the corresponding table within and outside the subquery.
+  # If no node is bound to the 'id' field, create a virtual node bound to it,
+  # so that it can be rebound by rebind_subquery().
+  #
+  def subquery_jc_path(sub_join, join_alias)
+    sub_join.empty? and raise ProgrammingError,
+      "Unexpected empty subquery, check your RDF storage configuration"
+    # fixme: reify instead of raising an exception
+
+    sub_join.transpose[0].uniq.collect do |a|
+      binding = SqlNodeBinding.new(a, 'id')
+
+      exists = false
+      @nodes.each do |node, n|
+        if @bindings[node].include?(binding)
+          exists = true
+          break
+        end
+      end
+
+      unless exists
+        node = '?' + join_alias + '_' + a
+        @nodes[node] = { :ground => true }
+        @bindings[node] = [ binding ]
+      end
+
+      [ a, [[ binding, binding ]] ]
+    end
+  end
+
+  # Generate a hash that maps all bindings that have been wrapped inside the
+  # _sub_path_ (a jc path, see jc_subgraph_path()) to rebound bindings based
+  # on the _join_alias_ so that they may still be used in the main query.
+  #
+  def rebind_subquery(sub_path, join_alias)
+    rebind = {}
+    field_count = 'a'
+
+    wrapped = {}
+    sub_path.each {|a,| wrapped[a] = true }
+
+    @nodes.each do |node, n|
+      @bindings[node].each do |b|
+        if wrapped[b.alias] and rebind[b].nil?
+          field = '_field_' << field_count
+          field_count.next!
+          rebind[b] = SqlNodeBinding.new(join_alias, field)
+        end
+      end
+    end
+
+    rebind
+  end
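Rebinding gives every binding that ends up inside a subquery a fresh exported column name on the subquery alias, so that code outside the subquery can still refer to it. The sketch below illustrates the naming scheme with plain `"alias.field"` strings instead of binding objects; `rebind` and the sample names are hypothetical.

```ruby
# bindings: "alias.field" strings in the order they are encountered.
# wrapped_aliases: aliases that were moved inside the subquery.
def rebind(bindings, wrapped_aliases, join_alias)
  field_count = 'a'.dup

  bindings.each_with_object({}) do |b, map|
    alias_name, = b.split('.')
    next unless wrapped_aliases.include?(alias_name) && !map.key?(b)

    map[b] = "#{join_alias}._field_#{field_count}"
    field_count.next!
  end
end

mapping = rebind(['a.id', 'c.id', 'c.subject', 'c.id'], ['c'], '_subquery_a')
# => {"c.id"=>"_subquery_a._field_a", "c.subject"=>"_subquery_a._field_b"}
```

Bindings on aliases that stay in the main query, such as `a.id` here, are left out of the map and keep their original names.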
+
+  # Go through global filter, filters in the main query, and join conditions
+  # attaching the subquery to the main query, rebind the bindings for nodes
+  # wrapped inside the subquery, and return a hash with keys for all bindings
+  # that should be selected from the subquery.
+  #
+  def subquery_select_nodes(rebind, main_path, sub_join)
+    select_nodes = {}
+
+    # update the global filter
+    @nodes.each do |node, n|
+      if r = rebind[ @bindings[node].first ]
+        @global_filter.gsub!(/#{Regexp.escape(node)}\b/) do
+          select_nodes[ @bindings[node].first ] = true
+          r.to_s
+        end
+      end
+    end
+
+    # update filters in the main query
+    main_path.each do |a,|
+      next if sub_join.assoc(a)
+
+      @aliases[a][:filter].each do |f|
+        f.rebind!(rebind) do |b|
+          select_nodes[b] = true
+        end
+      end
+    end
+
+    # update the subquery join path
+    sub_join.each do |a, jcs|
+      jcs.each do |jc|
+        select_nodes[ jc[0] ] = true
+        jc[1] = rebind[ jc[1] ]
+      end
+    end
+
+    # fixme: update main SELECT list
+    select_nodes
+  end
+
+  # Transform jc path (see jc_subgraph_path()) into a list of join conditions.
+  # If _rebind_ and _select_nodes_ hashes are defined, conditions will be
+  # rebound accordingly, and _select_nodes_ will be updated to include bindings
+  # used in the conditions.
+  #
+  def jc_path_to_join_conditions(jc_path, rebind = nil, select_nodes = nil)
+    conditions = []
+
+    jc_path.each do |a, jcs|
+      jcs.each do |b1, b2, n|
+        conditions.push SqlExpression.new(b1, '=', b2)
+      end
+    end
+
+    conditions.empty? and raise RuntimeError,
+      "Failed to join subquery to the main query"
+
+    conditions
+  end
+
+  # Generate FROM and WHERE clauses from a jc path (see jc_subgraph_path()).
+  #
+  def jc_path_to_tables_and_conditions(path)
+    first, = path[0]
+    a = @aliases[first]
+
+    tables = a[:table] + ' AS ' + first
+    conditions = a[:filter]
+
+    path[1, path.size - 1].each do |join_alias, join_on|
+      a = @aliases[join_alias]
+
+      tables <<
+        %{\nINNER JOIN #{a[:table]} AS #{join_alias} ON } <<
+        (
+          join_on.collect {|b1, b2| SqlExpression.new(b1, '=', b2) } +
+          a[:filter]
+        ).uniq.join(' AND ')
+    end
+
+    [ tables, conditions.uniq.join("\nAND ") ]
+  end
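A jc path maps onto SQL quite directly: the first alias opens the `FROM` clause, each later alias becomes an `INNER JOIN` whose `ON` part combines its join conditions with its own filters, and the filters of the first alias become the `WHERE` text. The sketch below uses plain strings in place of `SqlExpression` objects; `tables_and_conditions` and the sample tables and filters are hypothetical.

```ruby
# path:    [[alias, [[binding1, binding2], ...]], ...], bindings as strings
# aliases: alias => { :table => name, :filter => [condition strings] }
def tables_and_conditions(path, aliases)
  first, = path.first
  tables = "#{aliases[first][:table]} AS #{first}"

  path.drop(1).each do |join_alias, join_on|
    a = aliases[join_alias]
    on = (join_on.map { |b1, b2| "#{b1} = #{b2}" } + a[:filter]).uniq.join(' AND ')
    tables << "\nINNER JOIN #{a[:table]} AS #{join_alias} ON #{on}"
  end

  [tables, aliases[first][:filter].uniq.join("\nAND ")]
end

aliases = {
  'a' => { :table => 'resource',  :filter => ["a.uriref = 't'"] },
  'b' => { :table => 'statement', :filter => [] }
}
path = [['a', []], ['b', [['a.id', 'b.subject']]]]

from, where = tables_and_conditions(path, aliases)
puts from
# resource AS a
# INNER JOIN statement AS b ON a.id = b.subject
puts where
# a.uriref = 't'
```

Putting the filters of joined tables into `ON` rather than `WHERE` keeps each condition next to the table it restricts.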
+
+  # Find and declare as NULL key fields of a must-not-bind subquery.
+  #
+  def left_join_is_null(main_path, sub_join)
+    sub_join.each do |a, jcs|
+      jcs.each do |jc|
+        0.upto(1) do |i|
+          if main_path.assoc(jc[i].alias).nil?
+            @where.push SqlExpression.new(jc[i], 'IS NULL')
+            break
+          end
+        end
+      end
+    end
+  end
+end
+
+end
|