@useairfoil/flight 0.1.0

package/README.md ADDED
@@ -0,0 +1 @@
+ # @airfoil/flight
@@ -0,0 +1,524 @@
+ /*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+ syntax = "proto3";
+ import "google/protobuf/timestamp.proto";
+
+ package arrow.flight.protocol;
+
+ /*
+ * A flight service is an endpoint for retrieving or storing Arrow data. A
+ * flight service can expose one or more predefined endpoints that can be
+ * accessed using the Arrow Flight Protocol. Additionally, a flight service
+ * can expose a set of actions that are available.
+ */
+ service FlightService {
+
+ /*
+ * Handshake between client and server. Depending on the server, the
+ * handshake may be required to determine the token that should be used for
+ * future operations. Both request and response are streams to allow multiple
+ * round-trips depending on auth mechanism.
+ */
+ rpc Handshake(stream HandshakeRequest) returns (stream HandshakeResponse) {}
+
+ /*
+ * Get a list of available streams given particular criteria. Most flight
+ * services will expose one or more streams that are readily available for
+ * retrieval. This API allows listing the streams available for
+ * consumption. A user can also provide criteria. The criteria can limit
+ * the subset of streams that can be listed via this interface. Each flight
+ * service allows its own definition of how to consume criteria.
+ */
+ rpc ListFlights(Criteria) returns (stream FlightInfo) {}
+
+ /*
+ * For a given FlightDescriptor, get information about how the flight can be
+ * consumed. This is a useful interface if the consumer of the interface
+ * can already identify the specific flight to consume. This interface can
+ * also allow a consumer to generate a flight stream through a specified
+ * descriptor. For example, a flight descriptor might be something that
+ * includes a SQL statement or a Pickled Python operation that will be
+ * executed. In those cases, the descriptor will not be previously available
+ * within the list of available streams provided by ListFlights but will be
+ * available for consumption for the duration defined by the specific flight
+ * service.
+ */
+ rpc GetFlightInfo(FlightDescriptor) returns (FlightInfo) {}
+
+ /*
+ * For a given FlightDescriptor, start a query and get information
+ * to poll its execution status. This is a useful interface if the
+ * query may be a long-running query. The first PollFlightInfo call
+ * should return as quickly as possible. (GetFlightInfo doesn't
+ * return until the query is complete.)
+ *
+ * A client can consume any available results before
+ * the query is completed. See PollInfo.info for details.
+ *
+ * A client can poll the updated query status by calling
+ * PollFlightInfo() with PollInfo.flight_descriptor. A server
+ * should not respond until the result would be different from last
+ * time. That way, the client can "long poll" for updates
+ * without constantly making requests. Clients can set a short timeout
+ * to avoid blocking calls if desired.
+ *
+ * A client can't use PollInfo.flight_descriptor after
+ * PollInfo.expiration_time passes. A server might not accept the
+ * retry descriptor anymore and the query may be cancelled.
+ *
+ * A client may use the CancelFlightInfo action with
+ * PollInfo.info to cancel the running query.
+ */
+ rpc PollFlightInfo(FlightDescriptor) returns (PollInfo) {}
+
+ /*
+ * For a given FlightDescriptor, get the Schema as described in Schema.fbs::Schema.
+ * This is used when a consumer needs the Schema of a flight stream. Similar to
+ * GetFlightInfo, this interface may generate a new flight that was not previously
+ * available in ListFlights.
+ */
+ rpc GetSchema(FlightDescriptor) returns (SchemaResult) {}
+
+ /*
+ * Retrieve a single stream associated with a particular descriptor
+ * associated with the referenced ticket. A Flight can be composed of one or
+ * more streams where each stream can be retrieved using a separate opaque
+ * ticket that the flight service uses for managing a collection of streams.
+ */
+ rpc DoGet(Ticket) returns (stream FlightData) {}
+
+ /*
+ * Push a stream to the flight service associated with a particular
+ * flight stream. This allows a client of a flight service to upload a stream
+ * of data. Depending on the particular flight service, a client consumer
+ * could be allowed to upload a single stream per descriptor or an unlimited
+ * number. In the latter case, the service might implement a 'seal' action that
+ * can be applied to a descriptor once all streams are uploaded.
+ */
+ rpc DoPut(stream FlightData) returns (stream PutResult) {}
+
+ /*
+ * Open a bidirectional data channel for a given descriptor. This
+ * allows clients to send and receive arbitrary Arrow data and
+ * application-specific metadata in a single logical stream. In
+ * contrast to DoGet/DoPut, this is more suited for clients
+ * offloading computation (rather than storage) to a Flight service.
+ */
+ rpc DoExchange(stream FlightData) returns (stream FlightData) {}
+
+ /*
+ * Flight services can support an arbitrary number of simple actions in
+ * addition to the possible ListFlights, GetFlightInfo, DoGet, DoPut
+ * operations that are potentially available. DoAction allows a flight client
+ * to do a specific action against a flight service. An action includes
+ * opaque request and response objects that are specific to the type of action
+ * being undertaken.
+ */
+ rpc DoAction(Action) returns (stream Result) {}
+
+ /*
+ * A flight service exposes all of the available action types that it has
+ * along with descriptions. This allows different flight consumers to
+ * understand the capabilities of the flight service.
+ */
+ rpc ListActions(Empty) returns (stream ActionType) {}
+
+ }
+
+ /*
+ * The request that a client provides to a server on handshake.
+ */
+ message HandshakeRequest {
+
+ /*
+ * A defined protocol version
+ */
+ uint64 protocol_version = 1;
+
+ /*
+ * Arbitrary auth/handshake info.
+ */
+ bytes payload = 2;
+ }
+
+ message HandshakeResponse {
+
+ /*
+ * A defined protocol version
+ */
+ uint64 protocol_version = 1;
+
+ /*
+ * Arbitrary auth/handshake info.
+ */
+ bytes payload = 2;
+ }
+
+ /*
+ * A message for doing simple auth.
+ */
+ message BasicAuth {
+ string username = 2;
+ string password = 3;
+ }
+
+ message Empty {}
+
+ /*
+ * Describes an available action, including both the name used for execution
+ * along with a short description of the purpose of the action.
+ */
+ message ActionType {
+ string type = 1;
+ string description = 2;
+ }
+
+ /*
+ * A service-specific expression that can be used to return a limited set
+ * of available Arrow Flight streams.
+ */
+ message Criteria {
+ bytes expression = 1;
+ }
+
+ /*
+ * An opaque action specific for the service.
+ */
+ message Action {
+ string type = 1;
+ bytes body = 2;
+ }
+
+ /*
+ * The request of the CancelFlightInfo action.
+ *
+ * The request should be stored in Action.body.
+ */
+ message CancelFlightInfoRequest {
+ FlightInfo info = 1;
+ }
+
+ /*
+ * The request of the RenewFlightEndpoint action.
+ *
+ * The request should be stored in Action.body.
+ */
+ message RenewFlightEndpointRequest {
+ FlightEndpoint endpoint = 1;
+ }
+
+ /*
+ * An opaque result returned after executing an action.
+ */
+ message Result {
+ bytes body = 1;
+ }
+
+ /*
+ * The result of a cancel operation.
+ *
+ * This is used by CancelFlightInfoResult.status.
+ */
+ enum CancelStatus {
+ // The cancellation status is unknown. Servers should avoid using
+ // this value (send a NOT_FOUND error if the requested query is
+ // not known). Clients can retry the request.
+ CANCEL_STATUS_UNSPECIFIED = 0;
+ // The cancellation request is complete. Subsequent requests with
+ // the same payload may return CANCELLED or a NOT_FOUND error.
+ CANCEL_STATUS_CANCELLED = 1;
+ // The cancellation request is in progress. The client may retry
+ // the cancellation request.
+ CANCEL_STATUS_CANCELLING = 2;
+ // The query is not cancellable. The client should not retry the
+ // cancellation request.
+ CANCEL_STATUS_NOT_CANCELLABLE = 3;
+ }
+
+ /*
+ * The result of the CancelFlightInfo action.
+ *
+ * The result should be stored in Result.body.
+ */
+ message CancelFlightInfoResult {
+ CancelStatus status = 1;
+ }
+
+ /*
+ * Wraps the result of a GetSchema call.
+ */
+ message SchemaResult {
+ // The schema of the dataset in its IPC form:
+ // 4 bytes - an optional IPC_CONTINUATION_TOKEN prefix
+ // 4 bytes - the byte length of the payload
+ // a flatbuffer Message whose header is the Schema
+ bytes schema = 1;
+ }
+
+ /*
+ * The name or tag for a Flight. May be used as a way to retrieve or generate
+ * a flight or be used to expose a set of previously defined flights.
+ */
+ message FlightDescriptor {
+
+ /*
+ * Describes what type of descriptor is defined.
+ */
+ enum DescriptorType {
+
+ // Protobuf pattern, not used.
+ UNKNOWN = 0;
+
+ /*
+ * A named path that identifies a dataset. A path is composed of a string
+ * or list of strings describing a particular dataset. This is conceptually
+ * similar to a path inside a filesystem.
+ */
+ PATH = 1;
+
+ /*
+ * An opaque command to generate a dataset.
+ */
+ CMD = 2;
+ }
+
+ DescriptorType type = 1;
+
+ /*
+ * Opaque value used to express a command. Should only be defined when
+ * type = CMD.
+ */
+ bytes cmd = 2;
+
+ /*
+ * List of strings identifying a particular dataset. Should only be defined
+ * when type = PATH.
+ */
+ repeated string path = 3;
+ }
+
+ /*
+ * The access coordinates for retrieval of a dataset. With a FlightInfo, a
+ * consumer is able to determine how to retrieve a dataset.
+ */
+ message FlightInfo {
+ // The schema of the dataset in its IPC form:
+ // 4 bytes - an optional IPC_CONTINUATION_TOKEN prefix
+ // 4 bytes - the byte length of the payload
+ // a flatbuffer Message whose header is the Schema
+ bytes schema = 1;
+
+ /*
+ * The descriptor associated with this info.
+ */
+ FlightDescriptor flight_descriptor = 2;
+
+ /*
+ * A list of endpoints associated with the flight. To consume the
+ * whole flight, all endpoints (and hence all Tickets) must be
+ * consumed. Endpoints can be consumed in any order.
+ *
+ * In other words, an application can use multiple endpoints to
+ * represent partitioned data.
+ *
+ * If the returned data has an ordering, an application can use
+ * "FlightInfo.ordered = true" or should return all the data in a
+ * single endpoint. Otherwise, there is no ordering defined on
+ * endpoints or the data within.
+ *
+ * A client can read ordered data by reading data from returned
+ * endpoints, in order, from front to back.
+ *
+ * Note that a client may ignore "FlightInfo.ordered = true". If an
+ * ordering is important for an application, an application must
+ * choose one of the following:
+ *
+ * * An application requires that all clients must read data in
+ * returned endpoints order.
+ * * An application must return all the data in a single endpoint.
+ */
+ repeated FlightEndpoint endpoint = 3;
+
+ // Set these to -1 if unknown.
+ int64 total_records = 4;
+ int64 total_bytes = 5;
+
+ /*
+ * FlightEndpoints are in the same order as the data.
+ */
+ bool ordered = 6;
+
+ /*
+ * Application-defined metadata.
+ *
+ * There is no inherent or required relationship between this
+ * and the app_metadata fields in the FlightEndpoints or resulting
+ * FlightData messages. Since this metadata is application-defined,
+ * a given application could define there to be a relationship,
+ * but there is none required by the spec.
+ */
+ bytes app_metadata = 7;
+ }
+
+ /*
+ * The information to process a long-running query.
+ */
+ message PollInfo {
+ /*
+ * The currently available results.
+ *
+ * If "flight_descriptor" is not specified, the query is complete
+ * and "info" specifies all results. Otherwise, "info" contains
+ * partial query results.
+ *
+ * Note that each PollInfo response contains a complete
+ * FlightInfo (not just the delta between the previous and current
+ * FlightInfo).
+ *
+ * Subsequent PollInfo responses may only append new endpoints to
+ * info.
+ *
+ * Clients can begin fetching results via DoGet(Ticket) with the
+ * ticket in the info before the query is
+ * completed. FlightInfo.ordered is also valid.
+ */
+ FlightInfo info = 1;
+
+ /*
+ * The descriptor the client should use on the next try.
+ * If unset, the query is complete.
+ */
+ FlightDescriptor flight_descriptor = 2;
+
+ /*
+ * Query progress. If known, must be in [0.0, 1.0] but need not be
+ * monotonic or nondecreasing. If unknown, do not set.
+ */
+ optional double progress = 3;
+
+ /*
+ * Expiration time for this request. After this passes, the server
+ * might not accept the retry descriptor anymore (and the query may
+ * be cancelled). This may be updated on a call to PollFlightInfo.
+ */
+ google.protobuf.Timestamp expiration_time = 4;
+ }
+
+ /*
+ * A particular stream or split associated with a flight.
+ */
+ message FlightEndpoint {
+
+ /*
+ * Token used to retrieve this stream.
+ */
+ Ticket ticket = 1;
+
+ /*
+ * A list of URIs where this ticket can be redeemed via DoGet().
+ *
+ * If the list is empty, the expectation is that the ticket can only
+ * be redeemed on the current service where the ticket was
+ * generated.
+ *
+ * If the list is not empty, the expectation is that the ticket can
+ * be redeemed at any of the locations, and that the data returned
+ * will be equivalent. In this case, the ticket may only be redeemed
+ * at one of the given locations, and not (necessarily) on the
+ * current service.
+ *
+ * In other words, an application can use multiple locations to
+ * represent redundant and/or load balanced services.
+ */
+ repeated Location location = 2;
+
+ /*
+ * Expiration time of this stream. If present, clients may assume
+ * they can retry DoGet requests. Otherwise, it is
+ * application-defined whether DoGet requests may be retried.
+ */
+ google.protobuf.Timestamp expiration_time = 3;
+
+ /*
+ * Application-defined metadata.
+ *
+ * There is no inherent or required relationship between this
+ * and the app_metadata fields in the FlightInfo or resulting
+ * FlightData messages. Since this metadata is application-defined,
+ * a given application could define there to be a relationship,
+ * but there is none required by the spec.
+ */
+ bytes app_metadata = 4;
+ }
+
+ /*
+ * A location where a Flight service will accept retrieval of a particular
+ * stream given a ticket.
+ */
+ message Location {
+ string uri = 1;
+ }
+
+ /*
+ * An opaque identifier that the service can use to retrieve a particular
+ * portion of a stream.
+ *
+ * Tickets are meant to be single use. It is an error/application-defined
+ * behavior to reuse a ticket.
+ */
+ message Ticket {
+ bytes ticket = 1;
+ }
+
+ /*
+ * A batch of Arrow data as part of a stream of batches.
+ */
+ message FlightData {
+
+ /*
+ * The descriptor of the data. This is only relevant when a client is
+ * starting a new DoPut stream.
+ */
+ FlightDescriptor flight_descriptor = 1;
+
+ /*
+ * Header for message data as described in Message.fbs::Message.
+ */
+ bytes data_header = 2;
+
+ /*
+ * Application-defined metadata.
+ */
+ bytes app_metadata = 3;
+
+ /*
+ * The actual batch of Arrow data. Preferably handled with minimal copies.
+ * It comes last in the definition to help with sidecar patterns (it is
+ * expected that some implementations will fetch this field off the wire
+ * with specialized code to avoid extra memory copies).
+ */
+ bytes data_body = 1000;
+ }
+
+ /**
+ * The response message associated with the submission of a DoPut.
+ */
+ message PutResult {
+ bytes app_metadata = 1;
+ }
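
The `schema` field comments in `SchemaResult` and `FlightInfo` describe a small framing layout: an optional 4-byte continuation token, a 4-byte payload length, then the flatbuffer Message. The sketch below shows one way a client might split those bytes before handing the payload to a flatbuffer decoder. It is not part of this package: `splitSchemaBytes` and `EncapsulatedSchema` are hypothetical names, and the sketch assumes the token value `0xFFFFFFFF` and a little-endian length, as in the Arrow IPC format.

```typescript
// Marker that may precede the length prefix (assumed value, per Arrow IPC).
const IPC_CONTINUATION_TOKEN = 0xffffffff;

// Hypothetical result shape, introduced only for this sketch.
interface EncapsulatedSchema {
  hasContinuationToken: boolean;
  payloadLength: number;
  // Flatbuffer-encoded Message bytes; decoding them is out of scope here.
  payload: Uint8Array;
}

// Hypothetical helper: splits the `schema` bytes of a SchemaResult or
// FlightInfo into the framing fields described in the proto comments.
function splitSchemaBytes(schema: Uint8Array): EncapsulatedSchema {
  const view = new DataView(schema.buffer, schema.byteOffset, schema.byteLength);
  if (view.byteLength < 4) {
    throw new Error("schema bytes too short for a length prefix");
  }

  // The token is optional, so peek at the first 4 bytes before consuming them.
  let offset = 0;
  let hasContinuationToken = false;
  if (view.getUint32(0, true) === IPC_CONTINUATION_TOKEN) {
    hasContinuationToken = true;
    offset = 4;
  }

  if (view.byteLength < offset + 4) {
    throw new Error("schema bytes end before the length prefix");
  }
  // Assumed little-endian signed 32-bit length.
  const payloadLength = view.getInt32(offset, true);
  offset += 4;

  if (payloadLength < 0 || offset + payloadLength > schema.byteLength) {
    throw new Error("declared payload length exceeds the available bytes");
  }

  return {
    hasContinuationToken,
    payloadLength,
    // subarray creates a view over the same buffer, so no bytes are copied.
    payload: schema.subarray(offset, offset + payloadLength),
  };
}
```

A caller would pass the raw `schema` bytes from a `GetSchema` or `GetFlightInfo` response and then decode `payload` with whatever flatbuffer tooling it already uses. Peeking for the token is unambiguous under these assumptions, because the same four bytes read as a length would be negative.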