sql-ferret 0.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/GPL-3 +674 -0
- data/History.txt +3 -0
- data/Manifest.txt +6 -0
- data/README +294 -0
- data/lib/sql-ferret.rb +1719 -0
- data/sql-ferret.gemspec +18 -0
- metadata +66 -0
data/History.txt
ADDED
data/Manifest.txt
ADDED
data/README
ADDED
@@ -0,0 +1,294 @@
The SQL Ferret wraps SQLite into a navigational database style
interface.

BEWARE: this is raw and ugly EXPERIMENTAL code, and its API is
VERY LIKELY to change before 1.0.


== Overview

The [[Ferret::new]] constructor takes two arguments: a
(multiline) string containing the Ferret Data Schema Description
and a previously opened [[SQLite3::Database]] instance for
accessing an SQLite database with such a schema.  The resulting
[[Ferret]] instance's primary useful method is [[go]]; it takes
one mandatory argument -- the Ferret query string -- and may
also take numbered and named arguments, as well as a block.

Note that Ferret has TWO distinct DSLs: one for defining the
data model, one for querying and manipulating it.  When the
Ferret schema is stored in a text file, it's customarily given a
name in the form of [[foo.fers]].  Ferret query expressions are
typically inlined in Ruby code.
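
A minimal setup along these lines -- assuming a schema file
[[example.fers]] describing the [[residents]] table used in the
examples below -- might look roughly like this:

  require 'sqlite3'
  require 'sql-ferret'

  $ferret = Ferret.new File.read('example.fers'),
      SQLite3::Database.new('example.db')

  # iterate over all residents' names via the block form of [[go]]
  $ferret.go 'residents: -> name' do |name|
    puts name
  end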


== Ferret query language

The simplest form of a Ferret query expression is a query over
one table, filtering by one input column, and producing one
output column:

  <table> ':' <input-field> '->' <output-field>

Such an expression corresponds to the SQL of

  SELECT <output-field> FROM <table> WHERE <input-field> = ?

Note that in order to process such an expression, [[Ferret#go]]
requires an extra argument besides the expression -- the
exemplar value for [[<input-field>]].  The separation of
expression from data is a deliberate design feature of Ferret:
on the one hand, it's believed to make the expressions clearer;
on the other, it provides a measure of protection against
inadvertent SQL injection vulnerabilities.
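
For instance, given the [[residents]] table defined further
below, the lookup

  $ferret.go 'residents: id -> name', 42

runs SQL equivalent to [[SELECT name FROM residents WHERE id =
?]] with 42 bound to the placeholder and, because [[id]] is
unique, delivers either the matching name or [[nil]].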

The <input-field> can be omitted.  In such a case, the query
fetches all rows from <table>.  If more than one input field is
supplied, they must be separated by commas and surrounded by
parentheses; see the example below.  (Rationale: Ferret's [[->]]
operator normally binds very weakly on its left hand side.
While it would not be hard to write an exception into the
parser, it is believed that permitting the surrounding
parentheses to be omitted is likely to lead to confusing Ferret
query expressions.)

Multiple output fields can be specified by separating them with
commas.  Surrounding parentheses are not necessary or permitted
around the right-hand side of an arrow.
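
Combining the two, a query filtering by two input fields and
producing two output fields might read (the [[houses]] table is
defined below):

  $ferret.go 'houses: (street, number) -> name, resident',
      'Main Street', 7

which corresponds to [[SELECT ... FROM houses WHERE street = ?
AND number = ?]], the two exemplar values being bound in order.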

More complex queries involve multiple tables and what relational
algebra calls /joins/.  Since Ferret aims to provide a
navigational rather than purely relational interface, it
presents joins as /dereferencing/, denoted by a trailing [[->]]
operator.  That is, the query

  houses: number -> resident -> name, phone

can correspond to the SQL of

  SELECT houses.number, houses.resident, residents.name,
      residents.phone
    FROM houses LEFT JOIN residents ON
      houses.resident = residents.id
    WHERE houses.number = ?

provided that the data schema specifies that the column of
[[houses.resident]] refers to [[residents]] through its [[id]].
(In SQL parlance, it needs to be defined as a foreign key.)  If,
instead of [[left join]], an [[inner join]] is desired, the
two-ended dereferencing arrow [[<->]] needs to be used instead
of [[->]].

A Ferret data schema permitting such translation might look
roughly thus:

  [houses]
  id: primary key, integer
  number: optional integer
  name: optional varchar
  street: varchar
  resident: optional ref residents(id)

  [residents]
  id: primary key, integer
  name: varchar = 'John Smith'
  phone: optional varchar

Note that the columns are by default _not_ nullable but they can
be explicitly defined as nullable by the keyword [[optional]].
The [[=]] character followed by an SQL expression specifies the
default value for a column.

Also note that in the definition of [[resident: optional ref
residents(id)]], the [[(id)]] can be omitted because it's clear
from context -- [[residents.id]] is the primary key of
[[residents]].
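
With such a schema loaded, the dereferencing query shown above
might be issued, for example, in the block form:

  $ferret.go 'houses: number -> resident -> name, phone',
      8 do |row|
    puts "#{row.name} #{row.phone}"
  end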

This data schema permits only up to one resident per house.
What if the house->resident relation needs to have a 1->n shape
rather than 1->0..1?  We could move the linking column from
[[houses]] to [[residents]], like this:

  [houses]
  id: primary key, integer
  number: optional integer
  name: optional varchar
  street: varchar
  resident: ghost ref residents(house)

  [residents]
  id: primary key, integer
  name: varchar = 'John Smith'
  phone: optional varchar
  house: optional ref houses

Note that we're still defining [[houses.resident]] but it's no
longer a /column/ -- that is, it does not have a matching SQL
table column anymore --, but a /ghost field/.
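
With the linking column moved, a query across the ghost field
can naturally yield several rows per house; for example,

  $ferret.go 'houses: id -> resident -> name', 1

might now return an array holding one name per resident of the
house with id 1.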

Besides being primary keys, columns can be defined merely
[[unique]].  Ferret does not currently support composite
secondary keys, but a future version might.

How does [[Ferret#go]] deliver its results?  It depends.  If a
block is given to it, it will call this block with each row;
otherwise, it collects rows and returns them.  (Actually, if
Ferret can prove, using [[unique]] and [[primary key]]
constraints, that the query necessarily produces 0 or 1 rows, it
will return either [[nil]] or the one row; otherwise, it will
return an array of the rows.)  If the query specifies one column
(which may be preceded by dereferences), each 'row' will be
the value without encapsulation; otherwise, Ferret wraps rows
into [[OpenStruct]] instances.  The multicolumn behaviour can be
forced by adding an explicit trailing comma after what would
otherwise be the single requested column.  (Rationale: while
these rules are a bit clumsy to specify, they have proven
intuitive, in a Perlish way.)
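
Concretely, with the schema above, one might expect results
shaped roughly thus:

  $ferret.go 'residents: id -> name', 1
  # => "John Smith" or nil; unique exemplar, single bare column

  $ferret.go 'residents: id -> name,', 1
  # => an OpenStruct with .name (the trailing comma forces
  #    multicolumn mode) or nil

  $ferret.go 'residents: phone -> name', '555-1212'
  # => an array of names; phone is not declared unique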

Each queried column can be given an explicit name, analogously
to SQL's [[AS]] clause, by specifying the name between
apostrophes after the column's appearance in the expression.
Note that this is not a string literal; rather, Ferret parses
each apostrophe as a token, and the explicit name must parse as
a valid Ferret identifier token.
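
For example,

  houses: id -> resident -> name 'resident_name', phone

delivers the resident's name under the key [[resident_name]]
rather than [[name]].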

Star topology joins can be specified by surrounding a joining
arrow together with its right-hand side in parentheses, like
this:

  houses: number -> resident (-> name, phone), street

Such parentheses can be nested.

In addition to retrieval, Ferret query expressions also support
modification and deletion of entries.  This is notated by
terminating an expression in a 'blank' dereference operator
followed by a colon and a verb, like this:

  houses: number -> resident ->: set

The fields to be changed will then be specified as named
arguments to [[Ferret#go]].  The verb [[update]] can also be
used instead of [[set]]; it has exactly the same meaning.  When
the verb [[delete]] is used, [[Ferret#go]] does not take any
named arguments.
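
For example, to change or to remove the resident record linked
to house number 8, one might write:

  $ferret.go 'houses: number -> resident ->: set', 8,
      phone: '555-1212'

  $ferret.go 'houses: number -> resident ->: delete', 8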

Outside the Ferret query expression mechanism, there's the
[[Ferret#insert]] method that takes the target table's name as a
mandatory argument and the values to be inserted as named
arguments, like this:

  $ferret.insert 'residents',
      house: 8,
      name: 'Jacob Doe',
      phone: '555-1212'

A future version of the Ferret API is likely to provide record
insertion through [[Ferret#go]].  The reason we're not doing it
in this public release is that our autovivification mechanisms
are nowhere near settling yet.

Also of note is [[Ferret#change]], whose signature matches
[[Ferret#insert]] except that it performs the [[INSERT OR
REPLACE INTO ...]] operation instead of plain [[INSERT]], and
[[Ferret#transaction]], which supports recursive locking.
(This is quirky.  It's mainly intended for use in library
functions that need to group Ferret or SQL operations for
atomicity without assuming that an outer transaction exists or
does not exist, and it needs care even then.  Unless you know
you need it, you're probably better off using
[[SQLite3::Database#transaction]] directly.)
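
One way these calls might be combined -- a sketch, not canonical
usage -- is:

  $ferret.transaction do
    # [[insert]] returns the new row's id
    id = $ferret.insert 'residents', name: 'Mary Major'
    # [[change]] upserts, keyed here by the unique [[id]]
    $ferret.change 'residents', id: id, phone: '555-1212'
  end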

Instead of a single input value, it's permitted to pass
[[Ferret#go]] a whole collection of input values -- then Ferret
uses [[foo in (?, ...)]] instead of [[foo = ?]] and won't
consider this column's possible declared uniquity when deciding
whether the query is a single-row query -- or [[nil]], in which
case Ferret uses [[foo is null]] for proper SQL-style nullity
checking.  (A 'collection' is defined through duck typing --
anything that produces more than one value when the [[*]] prefix
operator is applied to it.)
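
For example:

  $ferret.go 'houses: street -> number',
      ['Main Street', 'High Street']
  # ... where street in (?, ?) ...

  $ferret.go 'houses: name -> street', nil
  # ... where name is null ...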

When it's desired that [[Ferret#go]] produce distinct values, a
trailing [[: distinct]] or [[: select distinct]] can be used.
(These two are synonymous.)  Note that for query-type verbs, the
colon *must not* be preceded by a blank-RHS dereferencing arrow,
unlike for mutation-type verbs, which require it.
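
For example, to list each street only once:

  streets = $ferret.go 'houses: -> street: distinct'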

Besides straight values, Ferret supports interpreted values.
The set of such is currently hardcoded and is:

  iso8601
  unix_time
  subsecond_unix_time
  json
  pretty_json
  yaml
  ruby_marshal
  packed_hex

When a Ferret schema assigns an interpreted rather than straight
data type to a column, [[Ferret#go]] will automatically
interpret and 'deterpret' values for this column, unless the
column's name is prefixed with a backslash in the expression.
Note that [[Ferret#insert]] does not (currently?) support
interpretation, and always processes raw values.
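
Assuming, say, a [[residents]] column [[last_seen]] declared
with the [[unix_time]] interpretation, [[residents: id ->
last_seen]] would deliver a Ruby [[Time]] instance, whereas
[[residents: id -> \last_seen]] would deliver the raw number
stored in the database.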


== Likely future development

- 'en passant' filters in addition to 'initial' filters.  These
  will probably be notated by brackets, and may permit
  ordering comparison in addition to equality checks;
- use of [[Range]] values in addition to collections as filters
  passed to [[Ferret#go]];
- explicit ordering of the produced rows, probably via unary
  postfix operators;
- SQL grouping and aggregate functions;
- [[Ferret#go]] returning rows as a [[Hash]] instead of an
  [[Array]], using a given key or keys;
- Kleene dereferencing arrows [[-*>]] and [[-+>]] (and their
  double-ended [[INNER JOIN]] counterparts), implemented via
  SQLite's recently introduced [[WITH RECURSIVE]] construct;
- multi-column uniquity constraints and foreign keys;
- automated handling of SIKR (Strictly Incremental Knowledge
  Representation) packets so that Fossil-style multinode
  tracking could be implemented for (nearly) arbitrary data
  structures;
- accessing SQL's views;
- customisable autovivification in multistage insertions;
- defining [[Hash]]-like interfaces atop [[Ferret]] that would
  be backed with a custom, potentially joined, relation in the
  underlying SQL table;
- defining per-column access-controlled [[Ferret]]-like subAPIs;
- integration of [[insert]] and [[change]] into [[Ferret#go]];
- an interface for Android's built-in SQLite API as available
  through Ruboto, as an alternative to the [[sqlite3]] Rubygem
  which is not available on Ruboto;
- better documentation;
- renaming to avoid clashing with
  <https://github.com/jkraemer/ferret>.


== Possible future development

- tracking the underlying SQLite database's schema via [[PRAGMA
  user_version]] and automatically upgrading it;
- command line tools for database setup and data import and
  export;
- transparently joining another table so as to implement an
  [[is-a]] type relation atop a SQL data model;
- transparently interpreting JSON or YAML data as extra payload
  fields of their containing table without explicit formal
  specification, akin to MongoDB;
- transparent compression of blob/YAML/JSON fields;
- extracting column type data, constraints, and foreign keys
  from the [[sqlite_master]] data so that the Ferret schema
  would only need to specify ghost fields and interpretations;
- a notation for /ad hoc/ joins;
- a third, hybrid DSL: Ferret dereference operator extensions to
  basic SQL queries, something like [[SELECT number, resident ->
  name, resident -> phone FROM houses WHERE id = ?]] or perhaps
  [[SELECT number, resident -> (name, phone) FROM ...]];
- custom enums as interpretations;
- global primary key definition in the schema.
data/lib/sql-ferret.rb
ADDED
@@ -0,0 +1,1719 @@
require 'json'
require 'ostruct'
require 'set'
require 'time'
require 'ugh'
require 'yaml'

class Ferret
  module Constants
    QSF_MULTICOL = 0x01
    QSF_MULTIROW = 0x02
  end

  attr_reader :schema

  # If the caller can guarantee that this [[Ferret]] instance
  # is never accessed from multiple threads at once, it can
  # turn off using the internal mutex by passing [[use_mutex:
  # false]] for a small performance increase.
  def initialize schema_source, sqlite = nil, use_mutex: true
    raise 'type mismatch' unless schema_source.is_a? String
    super()
    @schema = Ferret::Schema.new(schema_source)
    @sqlite = sqlite
    # Guards access to [[@sqlite]] and
    # [[@sqlite_locked]].
    @sync_mutex = use_mutex ? Mutex.new : nil
    # Are we currently in a transaction? (This lets us
    # implement [[Ferret#transaction]] reëntrantly.)
    @sqlite_locked = false
    return
  end

  def change table_name, **changes
    ugh? attempted: 'ferret-change' do
      table = @schema[table_name] or
          ugh 'unknown-table', table: table_name
      sql = table.sql_to_change changes.keys.map(&:to_s)
      _sync{@sqlite.execute sql, **changes}
    end
    return
  end

  def insert table_name, **changes
    ugh? attempted: 'ferret-insert' do
      table = @schema[table_name] or
          ugh 'unknown-table', table: table_name
      sql = table.sql_to_insert changes.keys.map(&:to_s)
      _sync do
        @sqlite.execute sql, **changes
        return @sqlite.last_insert_row_id
      end
    end
  end

  def transaction &thunk
    # Note that [[_sync]] is reentrant, too.
    _sync do
      # If we get to this point, the only 'concurrent' access
      # might come from our very own thread -- that is, a
      # subroutine down the execution stack from present.  This
      # means that we can now access [[@sqlite_locked]] as
      # though we were in a single-threading environment, and
      # thus use it as a flag for 'this thread has already
      # acquired the SQLite-level lock so there's no need to
      # engage it again'.  (SQLite's transaction mechanism on
      # its own is not reentrant.)
      if @sqlite_locked then
        return yield
      else
        return @sqlite.transaction do
          begin
            @sqlite_locked = true
            return yield
          ensure
            @sqlite_locked = false
          end
        end
      end
    end
  end

  def _sync &thunk
    if @sync_mutex.nil? or @sync_mutex.owned? then
      return yield
    else
      @sync_mutex.synchronize &thunk
    end
  end
  private :_sync

  def go raw_expr, *inputs, **changes, &thunk
    expr = Ferret::Expression_Parser.new(raw_expr, @schema).expr

    ugh? expr: raw_expr do
      ugh? attempted: 'ferret-go' do
        if inputs.length > expr.exemplars.length then
          ugh 'too-many-exemplars-given',
              expected: expr.exemplars.length,
              given: inputs.length
        elsif inputs.length < expr.exemplars.length then
          ugh 'not-enough-exemplars-given',
              expected: expr.exemplars.length,
              given: inputs.length
        end
      end

      if thunk and ![:select, :select_distinct].
          include? expr.type then
        ugh 'superfluous-thunk-supplied',
            explanation: 'query-not-a-select'
      end

      case expr.type
      when :select, :select_distinct then
        ugh? attempted: 'ferret-select' do
          ugh 'superfluous-changes' \
              unless changes.empty?

          ast = expr.select

          # At least for now, all the parameters behave as
          # simple ANDed filter rules.
          inputs_imply_single_row = false
          coll = Ferret::Parameter_Collector.new
          expr.exemplars.zip(inputs).each_with_index do
              |(exemplar_spec, input), seq_no|
            test, selects_one_p = coll.feed input, exemplar_spec
            inputs_imply_single_row |= selects_one_p
            ast.sql.gsub! /\[test\s+#{seq_no}\]/, test
          end

          # Let's now compose the framework of executing the
          # query from [[proc]]:s.

          # [[tuple_preparer]] takes a tuple of raw values
          # fetched from SQL and prepares it into a deliverable
          # object.
          tuple_preparer = if ast.shape & QSF_MULTICOL == 0 then
            # A single column was requested.  The deliverable
            # object is the piece of data from this column.
            proc do |row|
              Ferret.interpret ast.outputs.values.first,
                  row.first
            end
          else
            # Multiple columns were requested (or one column in
            # multicolumn mode).  The deliverable object is an
            # [[OpenStruct]] mapping field names to data from
            # these fields.
            proc do |row|
              output = OpenStruct.new
              raise 'assertion failed' \
                  unless row.length == ast.outputs.size
              # Note that we're relying on modern Ruby's
              # [[Hash]]'s retention of key order here.
              ast.outputs.to_a.each_with_index do
                  |(name, interpretation), i|
                output[name] =
                    Ferret.interpret interpretation, row[i]
              end
              output
            end
          end

          # [[query_executor]] takes a [[proc]], executes the
          # query, and calls [[proc]] with each tuple prepared
          # by [[tuple_preparer]].
          query_executor = proc do |&result_handler|
            @sqlite.execute ast.sql, **coll do |row|
              result_handler.call tuple_preparer.call(row)
            end
          end

          # [[processor]] executes the query and delivers
          # results either by yielding to [[thunk]] if it has
          # been given or by returning them if not, taking into
          # account the query's shape.
          if thunk then
            # A thunk was supplied -- we'll just pass prepared
            # rows to it.
            processor = proc do
              query_executor.call &thunk
            end
          else
            # Why [[and]] here?  Well, the shape flag tells us
            # whether the query can translate one input to more
            # than one, and [[inputs_imply_single_row]] tells us
            # whether there are more than one input values that
            # thus get translated.  We can only know that the
            # result is a single-row table if both of these
            # preconditions are satisfied.
            if (ast.shape & QSF_MULTIROW == 0) and
                inputs_imply_single_row then
              # A single row was requested (implicitly, by using
              # a unique field as an exemplar).  We'll return
              # this row, or [[nil]] if nothing was found.
              processor = lambda do
                query_executor.call do |output|
                  return output
                end
                return nil
              end
            else
              # Many rows were requested.  We'll collect them to
              # a list and return it.
              processor = proc do
                results = []
                query_executor.call do |output|
                  results.push output
                end
                return results
              end
            end
          end

          _sync &processor
        end

      when :update then
        ugh? attempted: 'ferret-update' do
          ugh 'missing-changes' \
              if changes.empty?

          changed_table = expr.stages.last.table
          sql = "update #{changed_table.name} set "
          changes.keys.each_with_index do |fn, i|
            field = changed_table[fn.to_s]
            ugh 'unknown-field', field: fn,
                table: changed_table.name,
                role: 'changed-field' \
                unless field
            sql << ", " unless i.zero?
            sql << "#{field.name} = :#{fn}"
          end

          if expr.stages.length > 1 then
            ast = expr.select
            sql << " where " <<
                expr.stages.last.stalk.ref.name <<
                " in (#{ast.sql})"
          else
            # Special case: the criteria and the update live in
            # a single table, so we won't need to do any joining
            # or subquerying.
            unless expr.exemplars.empty? then
              sql << " " << expr.where_clause
            end
          end

          # We're going to pass the changes to
          # [[SQLite::Database#execute]] in a [[Hash]].
          # Unfortunately, the Ruby interface of SQLite does not
          # support mixing numbered and named arguments.  As a
          # workaround, we'll pass the etalon as a named
          # argument whose name is a number.  This is also
          # convenient because it avoids clashes with any other
          # named parameters -- those are necessarily column
          # names, and column names can not be numbers.
          coll = Ferret::Parameter_Collector.new
          expr.exemplars.zip(inputs).each_with_index do
              |(exemplar_spec, input), seq_no|
            test, selects_one_p = coll.feed input, exemplar_spec
            sql.gsub! /\[test\s+#{seq_no}\]/, test
          end

          _sync do
            @sqlite.execute sql, **coll, **changes
            return @sqlite.changes
          end
        end

      when :delete then
        ugh? attempted: 'ferret-delete' do
          ugh 'superfluous-changes' \
              unless changes.empty?

          affected_table = expr.stages.last.table
          sql = "delete from #{affected_table.name} "

          if expr.stages.length > 1 then
            ast = expr.select
            sql << " where " <<
                expr.stages.last.stalk.ref.name <<
                " in (#{ast.sql})"
          else
            # Special case: the criteria live in the affected
            # table, so we won't need to do any joining or
            # subquerying.
            unless expr.exemplars.empty? then
              sql << " " << expr.where_clause
            end
          end

          coll = Ferret::Parameter_Collector.new
          expr.exemplars.zip(inputs).each_with_index do
              |(exemplar_spec, input), seq_no|
            test, selects_one_p = coll.feed input, exemplar_spec
            sql.gsub! /\[test\s+#{seq_no}\]/, test
          end

          _sync do
            @sqlite.execute sql, **coll
            return @sqlite.changes
          end
        end

      else
        raise 'assertion failed'
      end
    end
  end

  include Constants

  def pragma_user_version
    _sync do
      return @sqlite.get_first_value 'pragma user_version'
    end
  end

  def pragma_user_version= new_version
    raise 'type mismatch' unless new_version.is_a? Integer
    _sync do
      @sqlite.execute 'pragma user_version = ?', new_version
    end
    return new_version
  end

  def create_table name
    ugh? attempted: 'ferret-create-table' do
      _sync do
        @sqlite.execute sql_to_create_table(name)
      end
    end
    return
  end

  def self::interpret interpretation, value
    # If a [[null]] came from the database, we'll interpret it
    # as a [[nil]].
    return nil if value.nil?
    ugh? interpretation: interpretation.to_s,
        input: value.inspect do
      case interpretation
      when nil then
        return value
      when :unix_time, :subsecond_unix_time then
        ugh 'interpreted-value-type-error',
            input: value.inspect,
            expected: 'Numeric' \
            unless value.is_a? Numeric
        return Time.at(value)
      when :iso8601 then
        ugh 'interpreted-value-type-error',
            input: value.inspect,
            expected: 'String' \
            unless value.is_a? String
        return Time.xmlschema(value)
      when :json, :pretty_json then
        ugh 'interpreted-value-type-error',
            input: value.inspect,
            expected: 'String' \
            unless value.is_a? String
        return JSON.parse(value)
      when :yaml then
        ugh 'interpreted-value-type-error',
            input: value.inspect,
            expected: 'String' \
            unless value.is_a? String
        return YAML.load(value)
      when :ruby_marshal then
        ugh 'interpreted-value-type-error',
            input: value.inspect,
            expected: 'String' \
            unless value.is_a? String
        return Marshal.load(value)
      when :packed_hex then
        ugh 'interpreted-value-type-error',
            input: value.inspect,
            expected: 'String' \
            unless value.is_a? String
        ugh 'invalid-hex-data',
            input: value \
            unless value =~ /\A[\dabcdef]*\Z/
        ugh 'odd-length-hex-data',
            input: value \
            unless value.length % 2 == 0
        return [value].pack('H*')
      else
        raise 'assertion failed'
      end
    end
  end

  def self::deterpret interpretation, object
    # Note that we're not handling [[nil]] any specially.  If
    # this field permits [[null]] values, it's the caller's --
    # who lives somewhere in the query execution wrapper of
    # Ferret -- to handle [[nil]], and if it doesn't, passing
    # [[nil]] to [[deterpret]] is either an error or, in case of
    # YAML, requires special escaping.
    case interpretation
    when nil then
      return object
    when :unix_time then
      ugh 'deterpreted-value-type-error',
          input: object.inspect,
          expected: 'Time' \
          unless object.is_a? Time
      return object.to_i
    when :subsecond_unix_time then
      ugh 'deterpreted-value-type-error',
          input: object.inspect,
          expected: 'Time' \
          unless object.is_a? Time
      return object.to_f
    when :iso8601 then
      ugh 'deterpreted-value-type-error',
          input: object.inspect,
          expected: 'Time' \
          unless object.is_a? Time
      return object.xmlschema
    when :json then
      return JSON.generate(object)
    when :pretty_json then
      return JSON.pretty_generate(object)
    when :yaml then
      return YAML.dump(object)
    when :ruby_marshal then
      return Marshal.dump(object)
    when :packed_hex then
      ugh 'deterpreted-value-type-error',
          input: object.inspect,
          expected: 'String' \
          unless object.is_a? String
      return object.unpack('H*').first
    else
      raise 'assertion failed'
    end
  end
end

class Ferret::Alias_Generator
  def initialize used_ids
    super()
    @used_ids = Set.new used_ids
    @counter = 0
    return
  end

  def create prefix
    begin
      @counter += 1
      candidate = prefix + @counter.to_s
    end while @used_ids.include? candidate
    @used_ids.add candidate
    return candidate
  end

  def available? id
    return !@used_ids.include?(id)
  end

  def reserve id
    if @used_ids.include? id then
      ugh 'already-reserved', identifier: id
    end
    @used_ids.add id
    return id
  end
end

class Ferret::Schema
  def initialize schema_source
    raise 'type mismatch' unless schema_source.is_a? String
    super()
    @tables = {} # keyed by forced-lowercase names
    lineno = 0
    curtable = nil
    relocs = [] # a list of [[Proc]]:s
    @used_ids = Set.new
    # so we can avoid clashes when generating aliases;
    # forced downcase
    schema_source.each_line do |line|
      line.strip!
      lineno += 1
      ugh? context: 'parsing-ferret-schema',
          input: line,
          lineno: lineno do
        if line.empty? or line[0] == ?# then
          next
        elsif line =~ /^\[\s*(\w+)\s*\]\s*(#|$)/ then
          name = $1
          dname = name.downcase
          ugh 'duplicate table name', table: name \
              if @tables.has_key? dname
          curtable = @tables[dname] = Ferret::Table.new name
          @used_ids.add dname
        elsif line =~ /^(\w+)\s*:\s*/ then
          name, spec = $1, $'
          # Note that [[add_field]] will check the field's name
          # for uniquity.
          curtable.add_field(
              Ferret::Field.new(curtable, name, spec) do |thunk|
                relocs.push thunk
              end)
          @used_ids.add name.downcase
        else
          ugh 'unparseable-line'
        end
      end
    end
    # Now that we have loaded everything, we can resolve the
    # pointers.
    @tables.each_value do |table|
      ugh 'table-without-columns',
          table: table.name \
          unless table.has_columns?
    end
    relocs.each do |thunk|
      thunk.call self
    end
    return
  end

  def alias_generator
    return Ferret::Alias_Generator.new(@used_ids)
  end

  def [] name
    return @tables[name.downcase]
  end

  def tables
    return @tables.values
  end

  def sql_to_create_table name
    table = self[name]
    unless table then
      ugh 'unknown-table',
          table: name
    end
    return table.sql_to_create
  end
end

class Ferret::Table
  attr_reader :name
  def initialize name
    raise 'type mismatch' unless name.is_a? String
    super()
    @name = name
    @fields = {} # keyed by forced-lowercase names
    return
  end

  def [] name
    return @fields[name.downcase]
  end

  def empty?
    return @fields.empty?
  end

  def columns
    return @fields.values.select(&:column?)
  end

  def has_columns?
    return @fields.values.any?(&:column?)
  end

  # FIXME: move to the section for data model
  attr_reader :primary_key

  # [[Table#add_field]] is how new [[Field]]:s get added to a
  # [[Table]] as it gets parsed from a Ferret schema.  Thus, we
  # check for field name duplication and primary key clashes
  # here.  This is also a convenient place to set up
  # [[Table@primary_key]], too, as well as to check against a
  # table having been declared with multiple primary keys.
  def add_field field
    raise 'type mismatch' unless field.is_a? Ferret::Field
    raise 'assertion failed' \
        unless field.table.object_id == self.object_id
    dname = field.name.downcase
    ugh? table: @name do
      ugh 'duplicate-field', field: field.name \
          if @fields.has_key? dname
      if field.primary_key? then
        if @primary_key then
          ugh 'primary-key-clash',
              key1: @primary_key.name,
              key2: field.name
        end
        @primary_key = field
      end
    end
    @fields[dname] = field
    return field
  end

  def sql_to_change given_column_names
    key_column = sole_unique_column_among given_column_names

    given_columns = resolve_column_names given_column_names

    sql = "insert or replace into " + @name +
        "(" + columns.map(&:name).join(', ') + ") "

    ag = Ferret::Alias_Generator.new [@name, *@fields.keys]
    old_alias, new_alias = %w{old new}.map do |prefix|
      ag.available?(prefix) ?
          ag.reserve(prefix) :
          ag.create(prefix)
    end

    # Specify which field values are new and which ones are to
    # be retained (or initialised from defaults)
    sql << "select " << columns.map{|column| '%s.%s' % [
        given_columns.include?(column) ? new_alias : old_alias,
        column.name,
    ]}.join(', ')

    # Encode the changes as a subquery
    sql << " from (select " << given_column_names.map{|fn|
        ":#{fn} as #{fn}"}.join(', ') << ")"

    # Left-join the subquery against the preëxisting table
    sql << (" as %{new} left join %{table} as %{old} " +
        "on %{new}.%{key} = %{old}.%{key}") % {
      :old => old_alias,
      :new => new_alias,
      :key => key_column.name,
      :table => @name,
    }

    return sql
  end

  # Given a list of column names, figure out which of them is
  # the one and only unique (or primary key) field for this
  # table.  Ugh if any of them is not a field name; if any field
  # is mentioned multiple times; if multiple [[unique]] fields
  # are mentioned; or if no [[unique]] fields are mentioned.
  def sole_unique_column_among column_names
    ugh? table: @name do
      given_columns = resolve_column_names column_names
      unique_column = nil
      given_columns.each do |column|
        if column.unique? then
          if unique_column then
            ugh 'unique-column-conflict',
                field1: unique_column.name,
                field2: column.name
          end
          unique_column = column
        end
      end
      ugh 'no-unique-column-given',
          fields: given_columns.map(&:name).join(', '),
          known_unique_fields:
              @fields.values.select(&:unique?).
              map(&:name).join(', ') \
          unless unique_column
      return unique_column
    end
  end

  def sql_to_insert given_column_names
    ugh? table: @name do
      # We have to check this, lest we generate broken SQL.
      ugh 'inserting-null-tuple' \
          if given_column_names.empty?

      given_columns = resolve_column_names given_column_names

      # Check that all the mandatory fields are given
      @fields.each_value do |field|
        next if field.optional? or field.default
        next if given_columns.include? field
        # SQLite can autopopulate the [[integer primary key]]
        # field.
        next if field.primary_key? and field.type == 'integer'
        ugh 'mandatory-value-missing',
            table: @name,
            column: field.name,
            given_columns: given_columns.map(&:name).join(' ')
      end

      return "insert into " +
          "#{@name}(#{given_columns.map(&:name).join ', '}) " +
          "values(:#{given_column_names.join ', :'})"
    end
  end

  def resolve_column_names names
    results = []
    names.each do |fn|
      raise 'type mismatch' \
          unless fn.is_a? String
      field = @fields[fn.downcase]
      ugh 'unknown-field', field: fn,
          known_fields: @fields.values.map(&:name).
              join(', ') \
          unless field
      ugh 'not-a-column', field: field.name \
          unless field.column?
      ugh 'duplicate-field', field: field.name \
          if results.include? field
      results.push field
    end
    return results
  end

  def sql_to_create
    # No trailing semicolon.
    return "create table #{name} (\n " +
        @fields.values.select(&:column?).
        map(&:sql_to_declare).join(",\n ") +
        ")"
  end
end

class Ferret::Lexical_Ruleset
  attr_reader :multichar

  def initialize simple: [],
      intertoken: [],
      multichar: []

    raise 'duck type mismatch' \
        unless intertoken.respond_to? :include?
    raise 'duck type mismatch' \
        unless simple.respond_to? :include?
    raise 'duck type mismatch' \
        unless multichar.respond_to? :include?
    super()
    @intertoken = intertoken
    @simple = simple
    @multichar = multichar
    return
  end

  def intertoken? c
    return @intertoken.include? c
  end

  def simple_particle? c
    return @simple.include? c
  end

  def id_starter? c
    return [(?A .. ?Z), (?a .. ?z), [?_]].
        any?{|s| s.include? c}
  end

  def id_continuer? c
    return [(?A .. ?Z), (?a .. ?z), (?0 .. ?9), [?_]].
        any?{|s| s.include? c}
  end
end

Ferret::LEXICAL_RULESET = Ferret::Lexical_Ruleset.new(
    simple: ",:*()'\\<>",
    multichar: %w{-> <-> <= >=},
    intertoken: " \t\n\r\f")

class Ferret::Scanner
  def initialize expr
    raise 'type mismatch' unless expr.is_a? String
    super()
    @expr = expr
    @lex = Ferret::LEXICAL_RULESET

    @offset_ahead = 0
    @token_ahead = nil
    @offset_atail = nil
    @offset_behind = nil
    return
  end

  def _skip_intertoken_space
    loop do
      break if @offset_ahead >= @expr.length
      break unless @lex.intertoken? @expr[@offset_ahead]
      @offset_ahead += 1
    end
    return
  end
  private :_skip_intertoken_space

  def peek_token
    return @token_ahead if @token_ahead

    # Note that [[peek_token]] advances [[@offset_ahead]] to
    # skip over preceding intertoken space but no further.
    # Instead, it'll store the end offset of the peeked token
    # in [[@offset_atail]].
    _skip_intertoken_space

    # check for eof
    if @offset_ahead >= @expr.length then
      @offset_atail = @offset_ahead
      return @token_ahead = nil
    end

    # check for an identifier
    if @lex.id_starter? @expr[@offset_ahead] then
      @offset_atail = @offset_ahead
      loop do
        @offset_atail += 1
        break unless @lex.id_continuer? @expr[@offset_atail]
      end
      return @token_ahead =
          @expr[@offset_ahead ... @offset_atail]
    end

    # check for multi-char particles
    @lex.multichar.each do |etalon|
      if @expr[@offset_ahead, etalon.length] == etalon then
        @offset_atail = @offset_ahead + etalon.length
        return @token_ahead = etalon.to_sym
      end
    end

    # check for single-char particles
    if @lex.simple_particle? @expr[@offset_ahead] then
      @offset_atail = @offset_ahead + 1
      return @token_ahead = @expr[@offset_ahead].chr.to_sym
    end

    # give up
    ugh 'ferret-lexical-error',
        input: @expr,
        offset: @offset_ahead,
        lookahead: @expr[@offset_ahead, 10],
        lookbehind: @expr[
            [@offset_ahead - 10, 0].max ... @offset_ahead]
  end

  def expected! expectation, **extra
    # We'll call [[peek_token]] in advance so that
    # [[@offset_ahead]] would point exactly at the next token.
    tok = peek_token
    ugh('ferret-parse-error',
        expected: expectation,
        got: (tok || '*eof*').to_s,
        input: @expr,
        offset: @offset_ahead,
        **extra)
  end

  def _consume_token_ahead
    raise 'assertion failed' unless @offset_atail
    @offset_behind = @offset_ahead
    @offset_ahead = @offset_atail
    @token_ahead = nil
    @offset_atail = nil
    return
  end
  private :_consume_token_ahead

  def get_optional_id
    tok = peek_token
    if tok.is_a? String then
      _consume_token_ahead
      return block_given? ? yield(tok) : tok
    else
      return nil
    end
  end

  def get_optional_escaped_id expectation
    escaped_p = pass? :'\\'
    if escaped_p then
      return true, get_id(expectation)
    elsif id = get_optional_id then
      return false, id
    else
      return nil
    end
  end

  def get_id expectation
    return (get_optional_id or expected! expectation)
  end

  def pass? etalon
    tok = peek_token
    if tok == etalon then
      _consume_token_ahead
      return true
    else
      return false
    end
  end

  def pass etalon
    pass? etalon or expected! etalon
    return
  end

  def last_token_offset
    return @offset_behind
  end

  def next_token_offset
    _skip_intertoken_space \
        unless @token_ahead
    return @offset_ahead
  end

  def expected_eof!
    expected! '*eof*' unless next_token_offset >= @expr.length
    return
  end
end
922
|
+
class Ferret::Expression
|
923
|
+
attr_reader :stages
|
924
|
+
attr_reader :selectees
|
925
|
+
attr_reader :exemplars
|
926
|
+
attr_accessor :multicolumn
|
927
|
+
attr_accessor :type
|
928
|
+
|
929
|
+
def initialize
|
930
|
+
super()
|
931
|
+
@stages = [Ferret::Stage.new(nil, nil, :left)]
|
932
|
+
@selectees = []
|
933
|
+
@exemplars = []
|
934
|
+
@multicolumn = false
|
935
|
+
@type = :select # the default
|
936
|
+
return
|
937
|
+
end
|
938
|
+
|
939
|
+
def assign_stage_qualifiers ag
|
940
|
+
raise 'type mismatch' \
|
941
|
+
unless ag.is_a? Ferret::Alias_Generator
|
942
|
+
table_visit_counts = Hash.new 0 # name => count
|
943
|
+
@stages.each_with_index do |stage, i|
|
944
|
+
table_visit_counts[stage.table.name] += 1
|
945
|
+
end
|
946
|
+
|
947
|
+
# The tables that we visited more than once need
|
948
|
+
# distinguishing names.
|
949
|
+
@stages.each do |stage|
|
950
|
+
stage.qualifier =
|
951
|
+
if table_visit_counts[stage.table.name] > 1 then
|
952
|
+
ag.create stage.table.name[0]
|
953
|
+
else
|
954
|
+
stage.table.name
|
955
|
+
end
|
956
|
+
end
|
957
|
+
return
|
958
|
+
end
|
959
|
+
|
960
|
+
def from_clause
|
961
|
+
clause = "from "
|
962
|
+
@stages.each_with_index do |stage, i|
|
963
|
+
# In case of a non-query expression -- a modification --,
|
964
|
+
# the last stage is empty and mustn't be joined. It then
|
965
|
+
# serves only the purpose of holding the last stalk.
|
966
|
+
break if i == @stages.length - 1 and modification?
|
967
|
+
|
968
|
+
unless i.zero? then
|
969
|
+
clause << " #{stage.join_type} join "
|
970
|
+
end
|
971
|
+
|
972
|
+
clause << stage.table.name << " as " << stage.qualifier
|
973
|
+
|
974
|
+
unless i.zero? then
|
975
|
+
clause << " on %s.%s = %s.%s" % [
|
976
|
+
stage.parent.qualifier,
|
977
|
+
(stage.stalk.haunt || stage.stalk).name,
|
978
|
+
stage.qualifier, stage.stalk.ref.name,
|
979
|
+
]
|
980
|
+
end
|
981
|
+
end
|
982
|
+
return clause
|
983
|
+
end
|
984
|
+
|
985
|
+
def where_clause
|
986
|
+
raise 'assertion failed' if @exemplars.empty?
|
987
|
+
clause = "where "
|
988
|
+
@exemplars.each_with_index do |exemplar, i|
|
989
|
+
clause << " and " unless i.zero?
|
990
|
+
# The qualifier is only necessary if the clause has more
|
991
|
+
# than one stage.
|
992
|
+
if @stages.length > 1 then
|
993
|
+
# In the navigational model, the (primary) filter always
|
994
|
+
# lives in the zeroth stage.
|
995
|
+
clause << @stages[0].qualifier << "."
|
996
|
+
end
|
997
|
+
clause << exemplar.column.name << " [test #{i}]"
|
998
|
+
end
|
999
|
+
return clause
|
1000
|
+
end
|
1001
|
+
|
1002
|
+
# Prepare a [[select]] statement as an
|
1003
|
+
# [[Annotated_SQL_Template]]. If this expression represents a
|
1004
|
+
# query statement, the result will cover the whole query. If
|
1005
|
+
# it represents an update statement, the result will cover the
|
1006
|
+
# subquery that determines key value(s) of records in the last
|
1007
|
+
# table to update.
|
1008
|
+
def select
|
1009
|
+
qualifiers_needed =
|
1010
|
+
@stages.length != (modification? ? 2 : 1)
|
1011
|
+
sql_selectees = @selectees.map do |selectee|
|
1012
|
+
(qualifiers_needed ?
|
1013
|
+
selectee.stage.qualifier + "." : "") +
|
1014
|
+
(selectee.field.haunt || selectee.field).name
|
1015
|
+
end.join(', ')
|
1016
|
+
|
1017
|
+
outputs = {}
|
1018
|
+
@selectees.each do |selectee|
|
1019
|
+
outputs[selectee.output_name.to_sym] =
|
1020
|
+
selectee.interpretation
|
1021
|
+
end
|
1022
|
+
|
1023
|
+
sql = "select"
|
1024
|
+
sql << " distinct" if @type == :select_distinct
|
1025
|
+
sql << " " << sql_selectees << " " << from_clause
|
1026
|
+
|
1027
|
+
sql << " " << where_clause unless @exemplars.empty?
|
1028
|
+
|
1029
|
+
# Determine the shape of the table
|
1030
|
+
shape = 0
|
1031
|
+
shape |= QSF_MULTICOL if @multicolumn
|
1032
|
+
# If no [[unique]] exemplar field is specified or if any of
|
1033
|
+
# the joins is performed along a ghost field (i.e.,
|
1034
|
+
# possibly a 1->n reference), our result will have multiple
|
1035
|
+
# rows.
|
1036
|
+
shape |= QSF_MULTIROW \
|
1037
|
+
unless @exemplars.any?{|ex| ex.column.unique?} and
|
1038
|
+
!@stages[1 .. -1].any?{|stage| stage.stalk.ghost?}
|
1039
|
+
|
1040
|
+
return Ferret::Annotated_SQL_Template.new(sql,
|
1041
|
+
outputs, shape)
|
1042
|
+
end
|
1043
|
+
|
1044
|
+
include Ferret::Constants
|
1045
|
+
|
1046
|
+
def modification?
|
1047
|
+
case @type
|
1048
|
+
when :select, :select_distinct then
|
1049
|
+
return false
|
1050
|
+
when :update, :insert, :delete then
|
1051
|
+
return true
|
1052
|
+
else
|
1053
|
+
raise 'assertion failed'
|
1054
|
+
end
|
1055
|
+
end
|
1056
|
+
end
|
1057
|
+
|
1058
|
+
class Ferret::Expression_Parser
|
1059
|
+
attr_reader :expr
|
1060
|
+
|
1061
|
+
def initialize raw_expr, schema
|
1062
|
+
super()
|
1063
|
+
@raw_expr = raw_expr
|
1064
|
+
@schema = schema
|
1065
|
+
|
1066
|
+
@expr = Ferret::Expression.new
|
1067
|
+
@scanner = Ferret::Scanner.new @raw_expr
|
1068
|
+
|
1069
|
+
@first_star_offset = nil
|
1070
|
+
|
1071
|
+
first_table_name = @scanner.get_id 'table-name'
|
1072
|
+
@expr.stages[0].table = @schema[first_table_name] or
|
1073
|
+
ugh 'unknown-table',
|
1074
|
+
table: first_table_name,
|
1075
|
+
offset: @scanner.last_token_offset,
|
1076
|
+
expr: @raw_expr
|
1077
|
+
|
1078
|
+
@scanner.pass :':'
|
1079
|
+
|
1080
|
+
parenthesised = @scanner.pass? :'('
|
1081
|
+
loop do
|
1082
|
+
exemplar_escaped, exemplar_column_name =
|
1083
|
+
@scanner.get_optional_escaped_id 'column-expected'
|
1084
|
+
if exemplar_column_name then
|
1085
|
+
exemplar_column =
|
1086
|
+
@expr.stages[0].table[exemplar_column_name] or
|
1087
|
+
ugh 'unknown-field',
|
1088
|
+
field: exemplar_column_name,
|
1089
|
+
table: @expr.stages[0].table.name,
|
1090
|
+
role: 'key-field',
|
1091
|
+
offset: @scanner.last_token_offset,
|
1092
|
+
expr: @raw_expr
|
1093
|
+
# the key column must be a column, not a ghost field
|
1094
|
+
unless exemplar_column.column? then
|
1095
|
+
ugh 'not-a-column', field: exemplar_column.name,
|
1096
|
+
table: @expr.stages[0].table.name,
|
1097
|
+
offset: @scanner.last_token_offset,
|
1098
|
+
expr: @raw_expr
|
1099
|
+
end
|
1100
|
+
exemplar_interpretation = exemplar_escaped ?
|
1101
|
+
nil : exemplar_column.interpretation
|
1102
|
+
key_output_name =
|
1103
|
+
parse_optional_output_name_override ||
|
1104
|
+
exemplar_column_name
|
1105
|
+
@expr.exemplars.push Ferret::Exemplar.new(
|
1106
|
+
exemplar_column, exemplar_interpretation)
|
1107
|
+
@expr.selectees.push Ferret::Selectee.new(
|
1108
|
+
@expr.stages[0], exemplar_column,
|
1109
|
+
key_output_name, exemplar_interpretation)
|
1110
|
+
end
|
1111
|
+
break unless parenthesised and @scanner.pass? :','
|
1112
|
+
end
|
1113
|
+
@scanner.pass :')' if parenthesised
|
1114
|
+
|
1115
|
+
if @scanner.pass? :':' then
|
1116
|
+
# Colon without dereference: we should expect a fetch
|
1117
|
+
# verb.
|
1118
|
+
@expr.type = parse_fetch_verb
|
1119
|
+
else
|
1120
|
+
@scanner.pass :'->'
|
1121
|
+
|
1122
|
+
if @scanner.pass? :':' then
|
1123
|
+
# Colon past dereference: we should expect an update
|
1124
|
+
# verb.
|
1125
|
+
@expr.type = parse_update_verb
|
1126
|
+
else
|
1127
|
+
# Note that [[parse_stage]] can change [[@expr.type]]
|
1128
|
+
# if it meets the [[-> :]].
|
1129
|
+
parse_stage @expr.stages.last,
|
1130
|
+
parens: false
|
1131
|
+
end
|
1132
|
+
end
|
1133
|
+
|
1134
|
+
if @expr.modification? then
|
1135
|
+
if @first_star_offset then
|
1136
|
+
ugh 'star-in-modification',
|
1137
|
+
offset: @first_star_offset,
|
1138
|
+
expr: @raw_expr
|
1139
|
+
end
|
1140
|
+
end
|
1141
|
+
|
1142
|
+
@scanner.expected_eof!
|
1143
|
+
|
1144
|
+
if @expr.modification? then
|
1145
|
+
ugh 'multiple-columns-selected-in-modification' \
|
1146
|
+
if @expr.multicolumn
|
1147
|
+
end
|
1148
|
+
|
1149
|
+
unless @expr.multicolumn then
|
1150
|
+
# In single-column expressions, only the very last
|
1151
|
+
# selectee is actually selected.
|
1152
|
+
@expr.selectees[0 ... -1] = []
|
1153
|
+
end
|
1154
|
+
|
1155
|
+
@expr.assign_stage_qualifiers @schema.alias_generator
|
1156
|
+
return
|
1157
|
+
end
|
1158
|
+
|
1159
|
+
def start_subsequent_stage parent, stalk, join_type
|
1160
|
+
raise 'type mismatch' \
|
1161
|
+
unless parent.is_a? Ferret::Stage
|
1162
|
+
raise 'type mismatch' \
|
1163
|
+
unless stalk.is_a? Ferret::Field
|
1164
|
+
raise 'assertion failed' \
|
1165
|
+
unless [:left, :inner].include? join_type
|
1166
|
+
|
1167
|
+
# Note that we don't have the field's offset. But the
|
1168
|
+
# caller might.
|
1169
|
+
unless stalk.ref then
|
1170
|
+
ugh 'unable-to-dereference', field: field.name
|
1171
|
+
end
|
1172
|
+
|
1173
|
+
@expr.stages.push Ferret::Stage.new(
|
1174
|
+
parent, stalk, join_type)
|
1175
|
+
return
|
1176
|
+
end
|
1177
|
+
|
1178
|
+
def parse_stage stage, parens: false
|
1179
|
+
starred = false
|
1180
|
+
stage_empty = true
|
1181
|
+
loop do
|
1182
|
+
field_escaped, field_name =
|
1183
|
+
@scanner.get_optional_escaped_id 'field-expected'
|
1184
|
+
if field_name then
|
1185
|
+
field_offset = @scanner.last_token_offset
|
1186
|
+
|
1187
|
+
raise 'assertion failed' unless stage.table
|
1188
|
+
|
1189
|
+
field = stage.table[field_name] or
|
1190
|
+
ugh 'unknown-field', field: field_name,
|
1191
|
+
expr: @raw_expr,
|
1192
|
+
offset: field_offset
|
1193
|
+
|
1194
|
+
field_output_name =
|
1195
|
+
parse_optional_output_name_override ||
|
1196
|
+
field_name
|
1197
|
+
|
1198
|
+
# Has this column, or its name, been used already?
|
1199
|
+
(0 ... @expr.selectees.length).reverse_each do |i|
|
1200
|
+
selectee = @expr.selectees[i]
|
1201
|
+
if (selectee.stage == stage and
|
1202
|
+
selectee.field == field) or
|
1203
|
+
selectee.output_name == field_output_name then
|
1204
|
+
# Possible conflict detected.
|
1205
|
+
if selectee.star? then
|
1206
|
+
# The previous selectee was implicit, added due
|
1207
|
+
# to star expansion. We'll just discard it, for
|
1208
|
+
# explicit fields take precedence.
|
1209
|
+
@expr.selectees.delete_at i
|
1210
|
+
else
|
1211
|
+
ugh 'duplicate-field-in-stage',
|
1212
|
+
field: field.name,
|
1213
|
+
output_name: field_output_name,
|
1214
|
+
expr: @raw_expr,
|
1215
|
+
offset: field_offset
|
1216
|
+
end
|
1217
|
+
end
|
1218
|
+
end
|
1219
|
+
@expr.selectees.push Ferret::Selectee.new(
|
1220
|
+
stage, field,
|
1221
|
+
field_output_name, field_escaped ?
|
1222
|
+
nil : field.interpretation)
|
1223
|
+
stage_empty = false
|
1224
|
+
if @scanner.pass? :'(' then
|
1225
|
+
join_type = parse_optional_join_arrow or
|
1226
|
+
expected!('join-arrow',
|
1227
|
+
candidates: '-> <->')
|
1228
|
+
# If something goes wrong trying to start a new
|
1229
|
+
# stage, it must be the last field's fault.
|
1230
|
+
# ([[start_subsequent_stage]] won't attach the
|
1231
|
+
# offset to the ugh on its own just because it
|
1232
|
+
# doesn't _have_ the offset.)
|
1233
|
+
ugh? offset: field_offset do
|
1234
|
+
start_subsequent_stage stage, field, join_type
|
1235
|
+
end
|
1236
|
+
parse_stage @expr.stages.last,
|
1237
|
+
parens: true
|
1238
|
+
@scanner.pass :')'
|
1239
|
+
else
|
1240
|
+
if !parens and @scanner.pass? :':' then
|
1241
|
+
@expr.type = parse_fetch_verb
|
1242
|
+
break
|
1243
|
+
end
|
1244
|
+
if join_type = parse_optional_join_arrow then
|
1245
|
+
ugh? offset: field_offset do
|
1246
|
+
start_subsequent_stage stage, field, join_type
|
1247
|
+
end
|
1248
|
+
if !parens and @scanner.pass? :':' then
|
1249
|
+
@expr.type = parse_update_verb
|
1250
|
+
else
|
1251
|
+
parse_stage @expr.stages.last,
|
1252
|
+
parens: parens
|
1253
|
+
end
|
1254
|
+
break
|
1255
|
+
end
|
1256
|
+
end
|
1257
|
+
elsif @scanner.pass? :'*' then
|
1258
|
+
@first_star_offset ||= @scanner.last_token_offset
|
1259
|
+
# only one star per stage
|
1260
|
+
@scanner.expected! 'field-name' if starred
|
1261
|
+
starred = true
|
1262
|
+
@expr.multicolumn = true
|
1263
|
+
|
1264
|
+
stage.table.columns.each do |column|
|
1265
|
+
# We'll skip columns that have been selected (at
|
1266
|
+
# this stage) already, or columns whose names have
|
1267
|
+
# already been used.
|
1268
|
+
next if @expr.selectees.any? do |selectee|
|
1269
|
+
(selectee.stage == stage and
|
1270
|
+
selectee.field == column) or
|
1271
|
+
selectee.output_name == column.name
|
1272
|
+
end
|
1273
|
+
@expr.selectees.push Ferret::Selectee.new(
|
1274
|
+
stage, column,
|
1275
|
+
column.name, column.interpretation,
|
1276
|
+
true)
|
1277
|
+
end
|
1278
|
+
|
1279
|
+
# Note that [[->]] cannot appear immediately
|
1280
|
+
# following a [[*]].
|
1281
|
+
break
|
1282
|
+
else
|
1283
|
+
if stage_empty then
|
1284
|
+
@scanner.expected! 'field-name'
|
1285
|
+
end
|
1286
|
+
break
|
1287
|
+
end
|
1288
|
+
|
1289
|
+
if @scanner.pass? :',' then
|
1290
|
+
@expr.multicolumn = true
|
1291
|
+
else
|
1292
|
+
break
|
1293
|
+
end
|
1294
|
+
end
|
1295
|
+
return
|
1296
|
+
end
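# A hedged illustration of the stage grammar handled above (the
# field and table names are hypothetical, not taken from any real
# schema): a selectee list such as
#
#   id, customer(-> name), *
#
# reads a plain field, then a field followed by a parenthesised
# join arrow into the table it references, and finally a star
# that expands the remaining columns of the current stage's
# table, skipping any columns or output names already used.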
|
1297
|
+
|
1298
|
+
def parse_optional_output_name_override
|
1299
|
+
if @scanner.pass? :"'" then
|
1300
|
+
override = @scanner.get_id 'output-name-override'
|
1301
|
+
@scanner.pass :"'"
|
1302
|
+
return override
|
1303
|
+
else
|
1304
|
+
return nil
|
1305
|
+
end
|
1306
|
+
end
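# A hedged example of the override syntax recognised above: a
# selectee written as [[title 'headline']] (both names
# hypothetical) would presumably be returned under the output
# name [[headline]] rather than [[title]].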
|
1307
|
+
|
1308
|
+
def parse_optional_join_arrow
|
1309
|
+
return :left if @scanner.pass? :'->'
|
1310
|
+
return :inner if @scanner.pass? :'<->'
|
1311
|
+
return nil
|
1312
|
+
end
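# In other words, [[->]] yields [[:left]] and [[<->]] yields
# [[:inner]] (presumably corresponding to SQL left and inner
# joins); any other token leaves the arrow absent and [[nil]] is
# returned.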
|
1313
|
+
|
1314
|
+
def parse_fetch_verb
|
1315
|
+
verb = @scanner.get_id 'fetch-verb'
|
1316
|
+
case verb
|
1317
|
+
when 'select' then
|
1318
|
+
return @scanner.pass?('distinct') ?
|
1319
|
+
:select_distinct : :select
|
1320
|
+
when 'distinct' then
|
1321
|
+
return :select_distinct
|
1322
|
+
else
|
1323
|
+
ugh 'unknown-fetch-verb',
|
1324
|
+
got: verb,
|
1325
|
+
input: @expr,
|
1326
|
+
offset: @scanner.last_token_offset
|
1327
|
+
end
|
1328
|
+
end
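# Hypothetical examples of the fetch verbs accepted above: a
# query ending in [[: select]] fetches rows normally, while
# [[: select distinct]] or the shorthand [[: distinct]] yield
# [[:select_distinct]] (presumably mapping to SQL's SELECT
# DISTINCT); anything else raises an [[unknown-fetch-verb]] ugh.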
|
1329
|
+
|
1330
|
+
def parse_update_verb
|
1331
|
+
verb = @scanner.get_id 'update-verb'
|
1332
|
+
case verb
|
1333
|
+
when 'update', 'set' then
|
1334
|
+
return :update
|
1335
|
+
when 'delete' then
|
1336
|
+
return :delete
|
1337
|
+
else
|
1338
|
+
ugh 'unknown-update-verb',
|
1339
|
+
got: verb,
|
1340
|
+
input: @expr,
|
1341
|
+
offset: @scanner.last_token_offset
|
1342
|
+
end
|
1343
|
+
end
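# Similarly, when a join arrow is followed by a colon (as in the
# corresponding branch of [[parse_stage]] above), the verbs
# [[update]] (or its synonym [[set]]) and [[delete]] are
# accepted, yielding the [[:update]] and [[:delete]] expression
# types; anything else raises an [[unknown-update-verb]] ugh.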
|
1344
|
+
end
|
1345
|
+
|
1346
|
+
class Ferret::Stage
|
1347
|
+
attr_reader :parent
|
1348
|
+
attr_reader :stalk
|
1349
|
+
attr_reader :join_type
|
1350
|
+
|
1351
|
+
attr_accessor :table
|
1352
|
+
attr_accessor :qualifier
|
1353
|
+
|
1354
|
+
def initialize parent, stalk, join_type
|
1355
|
+
raise 'type mismatch' \
|
1356
|
+
unless parent.nil? or parent.is_a? Ferret::Stage
|
1357
|
+
raise 'type mismatch' \
|
1358
|
+
unless parent.nil? ?
|
1359
|
+
stalk.nil? : stalk.is_a?(Ferret::Field)
|
1360
|
+
raise 'assertion failed' \
|
1361
|
+
unless [:left, :inner].include? join_type
|
1362
|
+
super()
|
1363
|
+
@parent = parent
|
1364
|
+
@stalk = stalk
|
1365
|
+
@join_type = join_type
|
1366
|
+
|
1367
|
+
# If we have a stalk, it identifies this stage's table.
|
1368
|
+
# If not (which only happens for the very first stage),
|
1369
|
+
# the parser will use [[table=]] to set the stage's table
|
1370
|
+
# a bit later.
|
1371
|
+
@table = stalk && stalk.ref.table
|
1372
|
+
@qualifier = nil
|
1373
|
+
return
|
1374
|
+
end
|
1375
|
+
end
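# A brief sketch of how stages are meant to chain (no real schema
# implied): the first stage has no parent and no stalk and gets
# its table assigned by the parser, while each subsequent stage
# hangs off a reference field -- its stalk -- of the parent
# stage; its table is the table that field refers to, and it is
# attached with a [[:left]] or [[:inner]] join type depending on
# the arrow used.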
|
1376
|
+
|
1377
|
+
class Ferret::Field
|
1378
|
+
def inspect
|
1379
|
+
result = "#<Ferret::Field #{@table.name}.#{name}: "
|
1380
|
+
if primary_key? then
|
1381
|
+
result << 'primary key '
|
1382
|
+
else
|
1383
|
+
result << 'optional ' if optional?
|
1384
|
+
result << 'unique ' if unique?
|
1385
|
+
end
|
1386
|
+
if reference? then
|
1387
|
+
result << 'unconstrained ' if unconstrained?
|
1388
|
+
result << "ghost #{@haunt.name} " if ghost?
|
1389
|
+
result << 'ref %s(%s)' % [ref.table.name, ref.name]
|
1390
|
+
else
|
1391
|
+
result << (interpretation || type).to_s
|
1392
|
+
end
|
1393
|
+
# Note that [[default]] is an unsanitised, unprocessed
|
1394
|
+
# string extracted from the schema. In pathological cases,
|
1395
|
+
# it can potentially contain the [[>]] character.
|
1396
|
+
result << " = #{default}" if default
|
1397
|
+
result << '>'
|
1398
|
+
end
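# For a hypothetical primary key field [[id]] of type
# [[integer]] in a table [[customer]], the output of [[inspect]]
# might look like
# [[#<Ferret::Field customer.id: primary key integer>]].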
|
1399
|
+
|
1400
|
+
attr_reader :table
|
1401
|
+
|
1402
|
+
attr_reader :name
|
1403
|
+
|
1404
|
+
attr_reader :type
|
1405
|
+
|
1406
|
+
attr_reader :interpretation
|
1407
|
+
|
1408
|
+
def unique?
|
1409
|
+
return (@flags & (FF_PRIMARY_KEY | FF_EXPL_UNIQUE)) != 0
|
1410
|
+
end
|
1411
|
+
|
1412
|
+
def optional?
|
1413
|
+
return (@flags & FF_OPTIONAL) != 0
|
1414
|
+
end
|
1415
|
+
|
1416
|
+
def primary_key?
|
1417
|
+
return (@flags & FF_PRIMARY_KEY) != 0
|
1418
|
+
end
|
1419
|
+
|
1420
|
+
def unconstrained?
|
1421
|
+
return (@flags & FF_UNCONSTRAINED) != 0
|
1422
|
+
end
|
1423
|
+
|
1424
|
+
def reference?
|
1425
|
+
return (@flags & FF_REFERENCE) != 0
|
1426
|
+
end
|
1427
|
+
|
1428
|
+
def ghost?
|
1429
|
+
return (@flags & FF_GHOST) != 0
|
1430
|
+
end
|
1431
|
+
|
1432
|
+
def column?
|
1433
|
+
return (@flags & FF_GHOST) == 0
|
1434
|
+
end
|
1435
|
+
|
1436
|
+
attr_reader :haunt
|
1437
|
+
|
1438
|
+
attr_reader :default
|
1439
|
+
|
1440
|
+
attr_reader :ref
|
1441
|
+
|
1442
|
+
# Note that the parser does not look up the referred and
|
1443
|
+
# haunted columns, because at parsing time not all the
|
1444
|
+
# columns are available yet, so trying to look up forward
|
1445
|
+
# references would spuriously fail. Instead, it creates
|
1446
|
+
# 'relocation thunks' and [[yield]]:s them to the caller, who
|
1447
|
+
# must arrange to have them called (in the same order as they
|
1448
|
+
# were [[yield]]:ed) after the whole schema has been loaded;
|
1449
|
+
# the thunks will then perform these lookups and fill in the
|
1450
|
+
# corresponding slots in the structure.
|
1451
|
+
def initialize table, name, spec, &thunk
|
1452
|
+
raise 'type mismatch' unless table.is_a? Ferret::Table
|
1453
|
+
raise 'type mismatch' unless name.is_a? String
|
1454
|
+
raise 'type mismatch' unless spec.is_a? String
|
1455
|
+
super()
|
1456
|
+
@table = table
|
1457
|
+
@name = name
|
1458
|
+
unless spec.strip =~ %r{\A
|
1459
|
+
(
|
1460
|
+
| (?<primary_key> \b primary \s+ key \s* ,)
|
1461
|
+
| (?<unique> \b unique \b)
|
1462
|
+
| (?<optional> \b optional \b)
|
1463
|
+
| \b ghost \b \s* (?<haunt> \b \w+ \b)
|
1464
|
+
)\s*
|
1465
|
+
( (?<type> \b \w+ \b)
|
1466
|
+
| (?<unconstrained> \b unconstrained \b \s*)?
|
1467
|
+
\b ref \b \s* (?<ref_table> \w+)
|
1468
|
+
( \s* \( \s* (?<ref_field> \w+) \s* \) )?
|
1469
|
+
)
|
1470
|
+
( \s* = \s* (?<default> [^\s].*) )?
|
1471
|
+
\Z}x then
|
1472
|
+
ugh 'invalid-field-specification',
|
1473
|
+
input: spec
|
1474
|
+
end
|
1475
|
+
|
1476
|
+
unless $~['haunt'] then
|
1477
|
+
# Do we know the type?
|
1478
|
+
if $~['type'] and !%w{
|
1479
|
+
integer real varchar text blob iso8601
|
1480
|
+
unix_time subsecond_unix_time
|
1481
|
+
json pretty_json yaml
|
1482
|
+
ruby_marshal packed_hex}.include? $~['type'] then
|
1483
|
+
ugh 'unknown-type', type: $~['type']
|
1484
|
+
end
|
1485
|
+
else
|
1486
|
+
# The regex above is a bit too permissive.
|
1487
|
+
if $~['type'] or $~['unconstrained'] or $~['default'] then
|
1488
|
+
ugh 'invalid-field-specification',
|
1489
|
+
input: spec
|
1490
|
+
end
|
1491
|
+
end
|
1492
|
+
|
1493
|
+
if $~['primary_key'] and
|
1494
|
+
($~['ref_table'] or $~['default']) then
|
1495
|
+
ugh 'invalid-field-specification',
|
1496
|
+
input: spec
|
1497
|
+
end
|
1498
|
+
|
1499
|
+
@flags = 0
|
1500
|
+
@flags |= FF_PRIMARY_KEY if $~['primary_key']
|
1501
|
+
@flags |= FF_EXPL_UNIQUE if $~['unique']
|
1502
|
+
@flags |= FF_OPTIONAL if $~['optional']
|
1503
|
+
|
1504
|
+
# The current [[$~]] is unlikely to survive until the
|
1505
|
+
# relocation thunk gets called, so we'll have to copy
|
1506
|
+
# [[ref_table]] and [[ref_field]] out of it, into local
|
1507
|
+
# variables.
|
1508
|
+
if ref_table_name = $~['ref_table'] then
|
1509
|
+
@flags |= FF_REFERENCE
|
1510
|
+
ref_field_name = $~['ref_field']
|
1511
|
+
yield(proc do |schema|
|
1512
|
+
raise 'assertion failed' if @ref
|
1513
|
+
ref_table = schema[ref_table_name]
|
1514
|
+
ugh 'unknown-table', table: ref_table_name \
|
1515
|
+
unless ref_table
|
1516
|
+
ugh? referring_field: @name,
|
1517
|
+
referring_field_table: @table.name do
|
1518
|
+
if ref_field_name then
|
1519
|
+
@ref = ref_table[ref_field_name] or
|
1520
|
+
ugh 'unknown-field', field: ref_field_name,
|
1521
|
+
table: ref_table.name,
|
1522
|
+
significance: 'referred'
|
1523
|
+
else
|
1524
|
+
@ref = ref_table.primary_key or
|
1525
|
+
ugh 'no-primary-key', table: ref_table.name,
|
1526
|
+
significance: 'referred'
|
1527
|
+
end
|
1528
|
+
ugh 'not-a-column', field: @ref.name,
|
1529
|
+
table: ref.table.name,
|
1530
|
+
significance: 'referred' \
|
1531
|
+
unless @ref.column?
|
1532
|
+
end
|
1533
|
+
@type = @ref.type
|
1534
|
+
end)
|
1535
|
+
else
|
1536
|
+
@type = $~['type']
|
1537
|
+
end
|
1538
|
+
|
1539
|
+
if haunt = $~['haunt'] then
|
1540
|
+
@flags |= FF_GHOST
|
1541
|
+
yield(proc do |schema|
|
1542
|
+
ugh? significance: 'relied-on-by-ghost-field',
|
1543
|
+
ghost_field: @name do
|
1544
|
+
@haunt = @table[haunt]
|
1545
|
+
unless @haunt then
|
1546
|
+
ugh 'unknown-field', field: haunt
|
1547
|
+
end
|
1548
|
+
unless @haunt.column? then
|
1549
|
+
ugh 'not-a-column', field: @haunt.name
|
1550
|
+
end
|
1551
|
+
@type ||= @haunt.type
|
1552
|
+
unless @haunt.type == @type then
|
1553
|
+
ugh 'ghost-field-type-mismatch',
|
1554
|
+
field: @name,
|
1555
|
+
table: @table.name,
|
1556
|
+
type: @type.downcase,
|
1557
|
+
haunted_column: @haunt.name,
|
1558
|
+
haunted_column_type: @haunt.type.downcase
|
1559
|
+
end
|
1560
|
+
end
|
1561
|
+
end)
|
1562
|
+
end
|
1563
|
+
|
1564
|
+
@flags |= FF_UNCONSTRAINED if $~['unconstrained']
|
1565
|
+
@default = $~['default']
|
1566
|
+
|
1567
|
+
if @type then
|
1568
|
+
# [[@type]] can be [[nil]] if it's a reference field.
|
1569
|
+
# Then, the type and interpretation will be later copied
|
1570
|
+
# from the referred column.
|
1571
|
+
case @type.downcase
|
1572
|
+
when 'iso8601', 'json' then
|
1573
|
+
@interpretation = @type.downcase.to_sym
|
1574
|
+
@type = 'varchar'
|
1575
|
+
when 'yaml', 'pretty_json' then
|
1576
|
+
@interpretation = @type.downcase.to_sym
|
1577
|
+
@type = 'text'
|
1578
|
+
when 'ruby_marshal' then
|
1579
|
+
@interpretation = @type.downcase.to_sym
|
1580
|
+
@type = 'blob'
|
1581
|
+
when 'unix_time' then
|
1582
|
+
@interpretation = @type.downcase.to_sym
|
1583
|
+
@type = 'integer'
|
1584
|
+
when 'subsecond_unix_time' then
|
1585
|
+
@interpretation = @type.downcase.to_sym
|
1586
|
+
@type = 'real'
|
1587
|
+
else
|
1588
|
+
@interpretation = nil
|
1589
|
+
end
|
1590
|
+
end
|
1591
|
+
|
1592
|
+
return
|
1593
|
+
end
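# To illustrate the interpretation mapping above with a
# hypothetical field: declaring a field as [[unix_time]] stores
# it in an [[integer]] column with the [[:unix_time]]
# interpretation attached, and declaring it as [[yaml]] stores it
# in a [[text]] column with the [[:yaml]] interpretation; plain
# types such as [[varchar]] carry no interpretation at all.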
|
1594
|
+
|
1595
|
+
# [[Ferret::Field]] flags
|
1596
|
+
FF_PRIMARY_KEY = 0x01
|
1597
|
+
FF_EXPL_UNIQUE = 0x02
|
1598
|
+
FF_OPTIONAL = 0x04
|
1599
|
+
FF_UNCONSTRAINED = 0x08
|
1600
|
+
FF_REFERENCE = 0x10
|
1601
|
+
FF_GHOST = 0x20
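# These flags are combined bitwise; a field that is, say, both
# optional and explicitly unique would carry
# [[FF_EXPL_UNIQUE | FF_OPTIONAL]], that is [[0x06]].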
|
1602
|
+
|
1603
|
+
def sql_to_declare
|
1604
|
+
sql = "#@name #@type"
|
1605
|
+
if primary_key? then
|
1606
|
+
sql << " primary key"
|
1607
|
+
else
|
1608
|
+
sql << " unique" if unique?
|
1609
|
+
sql << " not null" unless optional?
|
1610
|
+
sql << " default #@default" if default
|
1611
|
+
end
|
1612
|
+
if reference? and !unconstrained? then
|
1613
|
+
sql << "\n references %s(%s)" %
|
1614
|
+
[@ref.table.name, @ref.name]
|
1615
|
+
end
|
1616
|
+
return sql
|
1617
|
+
end
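# As a hypothetical example, a non-optional unique [[varchar]]
# field named [[login]] would be declared as
# [[login varchar unique not null]], while a primary key field
# skips the uniqueness, nullability and default clauses
# altogether.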
|
1618
|
+
end
|
1619
|
+
|
1620
|
+
# [[sql]] is a [[String]] of the SQL template together with
|
1621
|
+
# placeholders. [[outputs]] is [[nil]] if this SQL is not a
|
1622
|
+
# query, or a [[Hash]] containing name->interpretation
|
1623
|
+
# mappings (in the order the values are [[select]]:ed by this
|
1624
|
+
# SQL statement) if it is. [[shape]] describes the expected
|
1625
|
+
# shape of the result set, assuming all the inputs are
|
1626
|
+
# single values.
|
1627
|
+
Ferret::Annotated_SQL_Template =
|
1628
|
+
Struct.new :sql, :outputs, :shape
|
1629
|
+
|
1630
|
+
class Ferret::Selectee
|
1631
|
+
attr_reader :stage
|
1632
|
+
attr_reader :field
|
1633
|
+
attr_reader :output_name
|
1634
|
+
attr_reader :interpretation
|
1635
|
+
|
1636
|
+
def initialize stage, field,
|
1637
|
+
output_name, interpretation,
|
1638
|
+
star_p = false
|
1639
|
+
raise 'type mismatch' unless field.is_a? Ferret::Field
|
1640
|
+
raise 'type mismatch' unless output_name.is_a? String
|
1641
|
+
super()
|
1642
|
+
@stage = stage
|
1643
|
+
@field = field
|
1644
|
+
@output_name = output_name
|
1645
|
+
@interpretation = interpretation
|
1646
|
+
@star_p = star_p
|
1647
|
+
return
|
1648
|
+
end
|
1649
|
+
|
1650
|
+
def star?
|
1651
|
+
return @star_p
|
1652
|
+
end
|
1653
|
+
end
|
1654
|
+
|
1655
|
+
class Ferret::Exemplar
|
1656
|
+
attr_reader :column
|
1657
|
+
attr_reader :interpretation
|
1658
|
+
|
1659
|
+
def initialize column, interpretation
|
1660
|
+
raise 'type mismatch' unless column.is_a? Ferret::Field
|
1661
|
+
raise 'assertion failed' unless column.column?
|
1662
|
+
raise 'type mismatch' \
|
1663
|
+
unless interpretation.nil? \
|
1664
|
+
or interpretation.is_a? Symbol
|
1665
|
+
super()
|
1666
|
+
@column = column
|
1667
|
+
@interpretation = interpretation
|
1668
|
+
return
|
1669
|
+
end
|
1670
|
+
end
|
1671
|
+
|
1672
|
+
class Ferret::Parameter_Collector < Array
|
1673
|
+
# [[parameter]] can be a plain value, a collection
|
1674
|
+
# ([[Enumerator]]) of values, or a [[nil]].
|
1675
|
+
def feed parameter, exemplar_spec
|
1676
|
+
raise 'type mismatch' \
|
1677
|
+
unless exemplar_spec.is_a? Ferret::Exemplar
|
1678
|
+
if parameter.nil? and exemplar_spec.column.optional? then
|
1679
|
+
test = "is " + _feed(nil)
|
1680
|
+
selects_one_p = false
|
1681
|
+
else
|
1682
|
+
*exemplar_values = *parameter # force to array
|
1683
|
+
exemplar_values.map! do |value|
|
1684
|
+
Ferret.deterpret exemplar_spec.interpretation, value
|
1685
|
+
end
|
1686
|
+
if exemplar_values.length != 1 then
|
1687
|
+
test = "in ("
|
1688
|
+
exemplar_values.each_with_index do |value, i|
|
1689
|
+
test << ", " unless i.zero?
|
1690
|
+
test << _feed(value)
|
1691
|
+
end
|
1692
|
+
test << ")"
|
1693
|
+
selects_one_p = false
|
1694
|
+
else
|
1695
|
+
test = "= " + _feed(exemplar_values.first)
|
1696
|
+
selects_one_p = exemplar_spec.column.unique?
|
1697
|
+
end
|
1698
|
+
end
|
1699
|
+
return test, selects_one_p
|
1700
|
+
end
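# A hedged sketch of the three cases above, with hypothetical
# inputs and assuming the collector starts out empty: feeding
# [[nil]] for an optional column produces the test [["is :0"]]
# (with a [[nil]] collected for the placeholder); feeding a
# single value produces [["= :0"]] and reports selecting at most
# one row when the column is unique; and feeding several values,
# say three of them, produces [["in (:0, :1, :2)"]].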
|
1701
|
+
|
1702
|
+
# Add the given [[parameter]] to this collector and return a
|
1703
|
+
# string containing its placeholder, in the form of colon
|
1704
|
+
# followed by a sequential number (0-based).
|
1705
|
+
def _feed parameter
|
1706
|
+
placeholder = length
|
1707
|
+
push parameter
|
1708
|
+
return ":#{placeholder}"
|
1709
|
+
end
|
1710
|
+
private :_feed
|
1711
|
+
|
1712
|
+
def to_hash
|
1713
|
+
h = {}
|
1714
|
+
each_with_index do |parameter, i|
|
1715
|
+
h[i.to_s.to_sym] = parameter
|
1716
|
+
end
|
1717
|
+
return h
|
1718
|
+
end
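# So a collector holding, hypothetically, the values [[42]] and
# [["foo"]] would yield [[{:'0' => 42, :'1' => "foo"}]], matching
# the [[:0]]-style placeholders emitted by [[_feed]].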
|
1719
|
+
end
|