@backstage/plugin-catalog-backend-module-incremental-ingestion 0.0.0-nightly-20221125023412
- package/CHANGELOG.md +21 -0
- package/README.md +341 -0
- package/alpha/package.json +6 -0
- package/dist/index.alpha.d.ts +153 -0
- package/dist/index.beta.d.ts +143 -0
- package/dist/index.cjs.js +979 -0
- package/dist/index.cjs.js.map +1 -0
- package/dist/index.d.ts +143 -0
- package/migrations/20221116073152_init.js +177 -0
- package/package.json +67 -0
package/CHANGELOG.md
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
# @backstage/plugin-catalog-backend-module-incremental-ingestion
|
|
2
|
+
|
|
3
|
+
## 0.0.0-nightly-20221125023412
|
|
4
|
+
|
|
5
|
+
### Minor Changes
|
|
6
|
+
|
|
7
|
+
- 98c643a1a2: Introduces incremental entity providers, which are used for streaming very large data sources into the catalog.
|
|
8
|
+
|
|
9
|
+
### Patch Changes
|
|
10
|
+
|
|
11
|
+
- Updated dependencies
|
|
12
|
+
- @backstage/plugin-catalog-backend@0.0.0-nightly-20221125023412
|
|
13
|
+
- @backstage/backend-common@0.0.0-nightly-20221125023412
|
|
14
|
+
- @backstage/backend-test-utils@0.0.0-nightly-20221125023412
|
|
15
|
+
- @backstage/plugin-permission-common@0.0.0-nightly-20221125023412
|
|
16
|
+
- @backstage/backend-plugin-api@0.0.0-nightly-20221125023412
|
|
17
|
+
- @backstage/plugin-catalog-node@0.0.0-nightly-20221125023412
|
|
18
|
+
- @backstage/backend-tasks@0.0.0-nightly-20221125023412
|
|
19
|
+
- @backstage/catalog-model@0.0.0-nightly-20221125023412
|
|
20
|
+
- @backstage/config@0.0.0-nightly-20221125023412
|
|
21
|
+
- @backstage/errors@0.0.0-nightly-20221125023412
|
package/README.md
ADDED
|
@@ -0,0 +1,341 @@
|
|
|
1
|
+
# `@backstage/plugin-catalog-backend-module-incremental-ingestion`
|
|
2
|
+
|
|
3
|
+
The Incremental Ingestion catalog backend module provides an Incremental Entity Provider that can be used to ingest data from sources using delta mutations, while retaining the orphan prevention mechanism provided by full mutations.
|
|
4
|
+
|
|
5
|
+
## Why did we create it?
|
|
6
|
+
|
|
7
|
+
Backstage provides an [Entity Provider mechanism that has two kinds of mutations](https://backstage.io/docs/features/software-catalog/external-integrations#provider-mutations): `delta` and `full`. A `delta` mutation tells the Backstage Catalog which entities should be added to and removed from the catalog. A `full` mutation accepts a list of entities and automatically computes which entities must be removed by comparing the provided entities against the existing entities to create a diff between the two sets. These two kinds of mutations are convenient for different kinds of data sources. A `delta` mutation can be used with a data source that emits UPDATE and DELETE events for its data. A `full` mutation is useful for APIs that produce fewer entities than can fit in the memory of the Backstage processes.
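To make the difference between the two mutation kinds concrete, here is a minimal in-memory sketch. The `Entity` and `Mutation` types and the `applyMutation` function are local stand-ins written for this illustration (assumptions, not the catalog's own types or implementation); they only model which entities a provider ends up owning after each kind of mutation.

```typescript
// Local stand-ins for illustration only; not imported from any Backstage package.
type Entity = { kind: string; name: string };

type Mutation =
  | { type: 'full'; entities: Entity[] }
  | { type: 'delta'; added: Entity[]; removed: Entity[] };

// Hypothetical key used to compare entities in this sketch.
const refOf = (e: Entity): string => `${e.kind}:${e.name}`.toLowerCase();

// Returns the set of entities a provider owns after the mutation is applied.
function applyMutation(existing: Entity[], mutation: Mutation): Entity[] {
  if (mutation.type === 'full') {
    // A full mutation replaces the whole set: anything not listed is implicitly removed.
    return mutation.entities.slice();
  }
  // A delta mutation only touches the entities it names.
  const removed = new Set(mutation.removed.map(refOf));
  const byRef = new Map<string, Entity>();
  for (const e of existing) {
    if (!removed.has(refOf(e))) byRef.set(refOf(e), e);
  }
  for (const e of mutation.added) byRef.set(refOf(e), e);
  return Array.from(byRef.values());
}

const current: Entity[] = [
  { kind: 'Component', name: 'a' },
  { kind: 'Component', name: 'b' },
];

// Delta: remove 'a', add 'c', leave 'b' untouched.
const afterDelta = applyMutation(current, {
  type: 'delta',
  added: [{ kind: 'Component', name: 'c' }],
  removed: [{ kind: 'Component', name: 'a' }],
});

// Full: only 'b' is provided, so 'a' is dropped without being named.
const afterFull = applyMutation(current, {
  type: 'full',
  entities: [{ kind: 'Component', name: 'b' }],
});
```

Note that the `full` branch needs the complete list in memory at once, which is the limitation the rest of this README is concerned with.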
|
|
8
|
+
|
|
9
|
+
Unfortunately, these two kinds of mutations are insufficient for very large data sources for the following reasons:
|
|
10
|
+
|
|
11
|
+
1. Even when the API provides DELETE events, we still need a way to create the initial list of entities. For example, if you ingest all repositories from GitHub into Backstage and you use webhooks, you still need the initial list of entities.
|
|
12
|
+
2. A `delta` mutation cannot guarantee that mutations will not be missed. For example, if your Backstage portal is down when a DELETE event arrives, you might miss the event, which leaves your catalog in an unclear state. How can you replay the missed events? Some data sources, like GitHub, provide an API for replaying missed events, but this increases complexity and is not available on all APIs.
|
|
13
|
+
3. Addressing the above two use cases with a `full` mutation is not an option on very large datasets because a `full` mutation requires that all entities are in memory to create a diff. If your data source has 100k+ records, this can easily cause your processes to run out of memory.
|
|
14
|
+
4. In cases when you can use `full` mutation, committing many entities into the processing pipeline fills up the processing queue and delays the processing of entities from other entity providers.
|
|
15
|
+
|
|
16
|
+
We created the Incremental Entity Provider to address all of the above issues. It does so with a combination of `delta` mutations and a mark-and-sweep mechanism. Instead of doing a single `full` mutation, it performs a series of bursts. At the end of each burst, the Incremental Entity Provider performs the following three operations:
|
|
17
|
+
|
|
18
|
+
1. Marks each received entity in the database.
|
|
19
|
+
2. Annotates each entity with the `backstage.io/incremental-provider-name: <entity-provider-id>` annotation (exported by this package as `INCREMENTAL_ENTITY_PROVIDER_ANNOTATION`).
|
|
20
|
+
3. Commits all of the entities with a `delta` mutation.
|
|
21
|
+
|
|
22
|
+
Incremental Entity Providers will wait a configurable interval before proceeding to the next burst.
|
|
23
|
+
|
|
24
|
+
Once the source has no more results, the Incremental Entity Provider compares all entities annotated with `backstage.io/incremental-provider-name: <entity-provider-id>` against all marked entities to determine which entities committed by the same entity provider were not marked during the last ingestion cycle. All unmarked entities are deleted at the end of the cycle. The Incremental Entity Provider then rests for a fixed interval before restarting the ingestion process.
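The mark-and-sweep cycle described above can be sketched in memory as follows. This is an illustration under stated assumptions: the `MarkAndSweep` class and its method names are hypothetical, and the module itself keeps its marks in the database rather than in process memory.

```typescript
// Hypothetical in-memory model of the cycle; not the module's implementation.
class MarkAndSweep {
  // Entity refs this provider committed in earlier cycles.
  private known = new Set<string>();
  // Entity refs received during the current cycle.
  private marked = new Set<string>();

  // Called at the end of each burst with the refs received in that burst.
  markBurst(refs: string[]): void {
    for (const ref of refs) {
      this.marked.add(ref);
      this.known.add(ref);
    }
  }

  // Called once the source reports no more results.
  // Returns the unmarked refs (to be deleted) and resets state for the next cycle.
  sweep(): string[] {
    const orphans = Array.from(this.known).filter(ref => !this.marked.has(ref));
    this.known = new Set(this.marked);
    this.marked = new Set<string>();
    return orphans;
  }
}

const tracker = new MarkAndSweep();

// First cycle: two bursts, nothing to delete afterwards.
tracker.markBurst(['component:default/a', 'component:default/b']);
tracker.markBurst(['component:default/c']);
const firstSweep = tracker.sweep();

// Second cycle: the source no longer returns 'b', so it is swept.
tracker.markBurst(['component:default/a']);
tracker.markBurst(['component:default/c']);
const secondSweep = tracker.sweep();
```

Only the refs are tracked across bursts, never the full entities, which is why the memory footprint stays small regardless of the size of the data source.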
|
|
25
|
+
|
|
26
|
+

|
|
27
|
+
|
|
28
|
+
This approach has the following benefits:
|
|
29
|
+
|
|
30
|
+
1. Reduced ingestion latency - each burst commits entities which are processed before the entire list is processed.
|
|
31
|
+
2. Stable pressure - each period between bursts provides an opportunity for the processing pipeline to settle without overwhelming the pipeline with a large number of unprocessed entities.
|
|
32
|
+
3. Built-in retry / back-off - Failed bursts are automatically retried with a built-in back-off interval providing an opportunity for the data source to reset its rate limits before retrying the burst.
|
|
33
|
+
4. Prevents orphan entities - Deleted entities are removed as with `full` mutation with a low memory footprint.
|
|
34
|
+
|
|
35
|
+
## Requirements
|
|
36
|
+
|
|
37
|
+
The Incremental Entity Provider backend is designed for data sources that provide paginated results. Each burst attempts to handle one or more pages of the query. The plugin will attempt to fetch as many pages as it can within a configurable burst length. At every iteration, it expects to receive the next cursor that will be used to query in the next iteration. Each iteration may happen on a different replica. This has several consequences:
|
|
38
|
+
|
|
39
|
+
1. The cursor must be serializable to JSON (not an issue for most RESTful or GraphQL based APIs).
|
|
40
|
+
2. The client must be stateless - a client is created from scratch for each iteration to allow distributing processing over multiple replicas.
|
|
41
|
+
3. There must be sufficient storage in Postgres to handle the additional data. (Presumably, this is also true of sqlite, but it has only been tested with Postgres.)
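The requirements above can be illustrated with a small sketch of a single burst. Everything here is hypothetical (`fetchPage`, `runBurst`, the five-page source, and the injected clock are assumptions made for the example): the point is only that the cursor travels between bursts as JSON and that each burst fetches pages until its time budget is spent.

```typescript
type Cursor = { page: number };
type PageResult = { done: boolean; items: string[]; cursor: Cursor };

// Hypothetical stateless page fetcher: everything it needs is in the cursor.
async function fetchPage(cursor: Cursor): Promise<PageResult> {
  const totalPages = 5;
  return {
    done: cursor.page >= totalPages,
    items: [`service-${cursor.page}`],
    cursor: { page: cursor.page + 1 },
  };
}

// Runs one burst: fetches pages until the budget is spent or the source is exhausted.
// The cursor goes in and out as a JSON string, as it would if persisted between
// bursts that may run on different replicas.
async function runBurst(
  storedCursor: string | undefined,
  burstLengthMs: number,
  now: () => number = Date.now,
): Promise<{ storedCursor: string; done: boolean; items: string[] }> {
  let cursor: Cursor = storedCursor ? JSON.parse(storedCursor) : { page: 1 };
  const deadline = now() + burstLengthMs;
  const items: string[] = [];
  let done = false;
  do {
    const result = await fetchPage(cursor);
    items.push(...result.items);
    cursor = result.cursor;
    done = result.done;
  } while (!done && now() < deadline);
  return { storedCursor: JSON.stringify(cursor), done, items };
}
```

Because `runBurst` receives nothing but the serialized cursor, a second call can resume exactly where the first one stopped, on any replica.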
|
|
42
|
+
|
|
43
|
+
## Installation
|
|
44
|
+
|
|
45
|
+
1. Install `@backstage/plugin-catalog-backend-module-incremental-ingestion` with `yarn workspace backend add @backstage/plugin-catalog-backend-module-incremental-ingestion`
|
|
46
|
+
2. Import `IncrementalCatalogBuilder` from `@backstage/plugin-catalog-backend-module-incremental-ingestion` and instantiate it with `await IncrementalCatalogBuilder.create(env, builder)`. You have to pass `builder` into the `IncrementalCatalogBuilder.create` function because `IncrementalCatalogBuilder` will convert an `IncrementalEntityProvider` into an `EntityProvider` and call `builder.addEntityProvider`.
|
|
47
|
+
|
|
48
|
+
```ts
|
|
49
|
+
const builder = CatalogBuilder.create(env);
|
|
50
|
+
// incremental builder receives builder because it'll register
|
|
51
|
+
// incremental entity providers with the builder
|
|
52
|
+
const incrementalBuilder = await IncrementalCatalogBuilder.create(
|
|
53
|
+
env,
|
|
54
|
+
builder,
|
|
55
|
+
);
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
3. As the last step, add `await incrementalBuilder.build()` after `await builder.build()` to ensure that all `CatalogBuilder` migrations run before the `incrementalBuilder.build()` migrations.
|
|
59
|
+
|
|
60
|
+
```ts
|
|
61
|
+
const { processingEngine, router } = await builder.build();
|
|
62
|
+
|
|
63
|
+
// this has to run after `await builder.build()` to ensure that catalog migrations are completed
|
|
64
|
+
// before incremental builder migrations are executed
|
|
65
|
+
await incrementalBuilder.build();
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
The result should look something like this:
|
|
69
|
+
|
|
70
|
+
```ts
|
|
71
|
+
import { CatalogBuilder } from '@backstage/plugin-catalog-backend';
|
|
72
|
+
import { ScaffolderEntitiesProcessor } from '@backstage/plugin-scaffolder-backend';
|
|
73
|
+
import { IncrementalCatalogBuilder } from '@backstage/plugin-catalog-backend-module-incremental-ingestion';
|
|
74
|
+
import { Router } from 'express';
|
|
75
|
+
import { Duration } from 'luxon';
|
|
76
|
+
import { PluginEnvironment } from '../types';
|
|
77
|
+
|
|
78
|
+
export default async function createPlugin(
|
|
79
|
+
env: PluginEnvironment,
|
|
80
|
+
): Promise<Router> {
|
|
81
|
+
const builder = CatalogBuilder.create(env);
|
|
82
|
+
// incremental builder receives builder because it'll register
|
|
83
|
+
// incremental entity providers with the builder
|
|
84
|
+
const incrementalBuilder = await IncrementalCatalogBuilder.create(
|
|
85
|
+
env,
|
|
86
|
+
builder,
|
|
87
|
+
);
|
|
88
|
+
|
|
89
|
+
builder.addProcessor(new ScaffolderEntitiesProcessor());
|
|
90
|
+
|
|
91
|
+
const { processingEngine, router } = await builder.build();
|
|
92
|
+
|
|
93
|
+
// this has to run after `await builder.build()` to ensure that catalog migrations are completed
|
|
94
|
+
// before incremental builder migrations are executed
|
|
95
|
+
const { incrementalAdminRouter } = await incrementalBuilder.build();
|
|
96
|
+
|
|
97
|
+
router.use('/incremental', incrementalAdminRouter);
|
|
98
|
+
|
|
99
|
+
await processingEngine.start();
|
|
100
|
+
|
|
101
|
+
return router;
|
|
102
|
+
}
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
## Writing an Incremental Entity Provider
|
|
106
|
+
|
|
107
|
+
To create an Incremental Entity Provider, you need to know how to retrieve a single page of the data that you wish to ingest into the Backstage catalog. If the API has pagination and you know how to make a paginated request to that API, you'll be able to implement an Incremental Entity Provider for this API. For more information about compatibility, check out the <a href="#compatible-data-source">Compatible data sources</a> section on this page.
|
|
108
|
+
|
|
109
|
+
Here is the type definition for an Incremental Entity Provider.
|
|
110
|
+
|
|
111
|
+
```ts
|
|
112
|
+
interface IncrementalEntityProvider<TCursor, TContext> {
|
|
113
|
+
/**
|
|
114
|
+
* This name must be unique between all of the entity providers
|
|
115
|
+
* operating in the catalog.
|
|
116
|
+
*/
|
|
117
|
+
getProviderName(): string;
|
|
118
|
+
|
|
119
|
+
/**
|
|
120
|
+
* Do any setup and teardown necessary in order to provide the
|
|
121
|
+
* context for fetching pages. This should always invoke `burst` in
|
|
122
|
+
* order to fetch the individual pages.
|
|
123
|
+
*
|
|
124
|
+
* @param burst - a function which performs a series of iterations
|
|
125
|
+
*/
|
|
126
|
+
around(burst: (context: TContext) => Promise<void>): Promise<void>;
|
|
127
|
+
|
|
128
|
+
/**
|
|
129
|
+
* Return a single page of entities from a specific point in the
|
|
130
|
+
* ingestion.
|
|
131
|
+
*
|
|
132
|
+
* @param context - anything needed in order to fetch a single page.
|
|
133
|
+
* @param cursor - a unique value identifying the page to ingest.
|
|
134
|
+
* @returns the entities to be ingested, as well as the cursor of
|
|
135
|
+
* the next page after this one.
|
|
136
|
+
*/
|
|
137
|
+
next(
|
|
138
|
+
context: TContext,
|
|
139
|
+
cursor?: TCursor,
|
|
140
|
+
): Promise<EntityIteratorResult<TCursor>>;
|
|
141
|
+
}
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
For this tutorial, we'll write an Incremental Entity Provider that calls an imaginary API. This imaginary API returns a list of imaginary services and has an imaginary API client with the following interface.
|
|
145
|
+
|
|
146
|
+
```ts
|
|
147
|
+
interface MyApiClient {
|
|
148
|
+
getServices(page: number): Promise<MyPaginatedResults<Service>>;
|
|
149
|
+
}
|
|
150
|
+
|
|
151
|
+
interface MyPaginatedResults<T> {
|
|
152
|
+
items: T[];
|
|
153
|
+
totalPages: number;
|
|
154
|
+
}
|
|
155
|
+
|
|
156
|
+
interface Service {
|
|
157
|
+
name: string;
|
|
158
|
+
}
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
An Incremental Entity Provider has only three methods that you need to implement: `getProviderName`, `around`, and `next`. `getProviderName()` is self-explanatory and works exactly as it does on an Entity Provider.
|
|
162
|
+
|
|
163
|
+
```ts
|
|
164
|
+
import {
|
|
165
|
+
IncrementalEntityProvider,
|
|
166
|
+
EntityIteratorResult,
|
|
167
|
+
} from '@backstage/plugin-catalog-backend-module-incremental-ingestion';
|
|
168
|
+
|
|
169
|
+
// this will include your pagination information, let's say our API accepts a `page` parameter.
|
|
170
|
+
// In this case, the cursor will include `page`
|
|
171
|
+
interface MyApiCursor {
|
|
172
|
+
page: number;
|
|
173
|
+
}
|
|
174
|
+
|
|
175
|
+
// This interface describes the type of data that will be passed to your burst function.
|
|
176
|
+
interface MyContext {
|
|
177
|
+
apiClient: MyApiClient;
|
|
178
|
+
}
|
|
179
|
+
|
|
180
|
+
export class MyIncrementalEntityProvider
|
|
181
|
+
implements IncrementalEntityProvider<MyApiCursor, MyContext>
|
|
182
|
+
{
|
|
183
|
+
getProviderName() {
|
|
184
|
+
return `MyIncrementalEntityProvider`;
|
|
185
|
+
}
|
|
186
|
+
}
|
|
187
|
+
```
|
|
188
|
+
|
|
189
|
+
The `around` method is used for setup and teardown. For example, if you need to create a client that will connect to the API, you would do that here.
|
|
190
|
+
|
|
191
|
+
```ts
|
|
192
|
+
export class MyIncrementalEntityProvider
|
|
193
|
+
implements IncrementalEntityProvider<MyApiCursor, MyContext>
|
|
194
|
+
{
|
|
195
|
+
getProviderName() {
|
|
196
|
+
return `MyIncrementalEntityProvider`;
|
|
197
|
+
}
|
|
198
|
+
|
|
199
|
+
async around(burst: (context: MyContext) => Promise<void>): Promise<void> {
|
|
200
|
+
const apiClient = new MyApiClient();
|
|
201
|
+
|
|
202
|
+
await burst({ apiClient });
|
|
203
|
+
|
|
204
|
+
// if you need to do any teardown, you can do it here
|
|
205
|
+
}
|
|
206
|
+
}
|
|
207
|
+
```
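If the client holds resources that must be released, one option is to wrap the `burst` call in `try`/`finally` so that teardown runs even when the burst throws. The sketch below is self-contained and uses a hypothetical `FakeClient` with a `close()` method (an assumption for illustration; the imaginary `MyApiClient` above does not define one).

```typescript
// Hypothetical client with an explicit close(); not part of any Backstage API.
class FakeClient {
  closed = false;
  close(): void {
    this.closed = true;
  }
}

type Context = { client: FakeClient };

// Kept only so the example can inspect what happened to each client.
const createdClients: FakeClient[] = [];

async function around(
  burst: (context: Context) => Promise<void>,
): Promise<void> {
  // Setup: a fresh client for every burst keeps the provider stateless.
  const client = new FakeClient();
  createdClients.push(client);
  try {
    await burst({ client });
  } finally {
    // Teardown runs whether the burst succeeded or threw;
    // the error, if any, still propagates to the caller.
    client.close();
  }
}
```

Letting the error propagate out of `around` matters, because a thrown error is what triggers the retry behaviour described under Error handling.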
|
|
208
|
+
|
|
209
|
+
If you need to pass a token to your API, then you can create a constructor that will receive a token and use the token to setup the client.
|
|
210
|
+
|
|
211
|
+
```ts
|
|
212
|
+
export class MyIncrementalEntityProvider
|
|
213
|
+
implements IncrementalEntityProvider<MyApiCursor, MyContext>
|
|
214
|
+
{
|
|
215
|
+
token: string;
|
|
216
|
+
|
|
217
|
+
constructor(token: string) {
|
|
218
|
+
this.token = token;
|
|
219
|
+
}
|
|
220
|
+
|
|
221
|
+
getProviderName() {
|
|
222
|
+
return `MyIncrementalEntityProvider`;
|
|
223
|
+
}
|
|
224
|
+
|
|
225
|
+
async around(burst: (context: MyContext) => Promise<void>): Promise<void> {
|
|
226
|
+
const apiClient = new MyApiClient(this.token);
|
|
227
|
+
|
|
228
|
+
await burst({ apiClient });
|
|
229
|
+
}
|
|
230
|
+
}
|
|
231
|
+
```
|
|
232
|
+
|
|
233
|
+
The last step is to implement the actual `next` method that will accept the cursor, call the API, process the result and return the result.
|
|
234
|
+
|
|
235
|
+
```ts
|
|
236
|
+
export class MyIncrementalEntityProvider implements IncrementalEntityProvider<MyApiCursor, MyContext> {
|
|
237
|
+
|
|
238
|
+
token: string;
|
|
239
|
+
|
|
240
|
+
constructor(token: string) {
|
|
241
|
+
this.token = token;
|
|
242
|
+
}
|
|
243
|
+
|
|
244
|
+
getProviderName() {
|
|
245
|
+
return `MyIncrementalEntityProvider`;
|
|
246
|
+
}
|
|
247
|
+
|
|
248
|
+
|
|
249
|
+
async around(burst: (context: MyContext) => Promise<void>): Promise<void> {
|
|
250
|
+
|
|
251
|
+
const apiClient = new MyApiClient(this.token)
|
|
252
|
+
|
|
253
|
+
await burst({ apiClient })
|
|
254
|
+
}
|
|
255
|
+
|
|
256
|
+
async next(context: MyContext, cursor: MyApiCursor = { page: 1 }): Promise<EntityIteratorResult<MyApiCursor>> {
|
|
257
|
+
const { apiClient } = context;
|
|
258
|
+
|
|
259
|
+
// call your API with the current cursor
|
|
260
|
+
const data = await apiClient.getServices(cursor.page);
|
|
261
|
+
|
|
262
|
+
// calculate the next page
|
|
263
|
+
const nextPage = cursor.page + 1;
|
|
264
|
+
|
|
265
|
+
// figure out if there are any more pages to fetch
|
|
266
|
+
const done = nextPage > data.totalPages;
|
|
267
|
+
|
|
268
|
+
// convert returned items into entities
|
|
269
|
+
const entities = data.items.map(item => ({
|
|
270
|
+
entity: {
|
|
271
|
+
apiVersion: 'backstage.io/v1beta1',
|
|
272
|
+
kind: 'Component',
|
|
273
|
+
metadata: {
|
|
274
|
+
name: item.name,
|
|
275
|
+
annotations: {
|
|
276
|
+
// You need to define these, otherwise they'll fail validation
|
|
277
|
+
[ANNOTATION_LOCATION]: this.getProviderName(),
|
|
278
|
+
[ANNOTATION_ORIGIN_LOCATION]: this.getProviderName(),
|
|
279
|
+
}
|
|
280
|
+
},
|
|
281
|
+
spec: {
|
|
282
|
+
type: 'service'
|
|
283
|
+
}
|
|
284
|
+
}
|
|
285
|
+
}));
|
|
286
|
+
|
|
287
|
+
// create the next cursor
|
|
288
|
+
const nextCursor = {
|
|
289
|
+
page: nextPage
|
|
290
|
+
};
|
|
291
|
+
|
|
292
|
+
return {
|
|
293
|
+
done,
|
|
294
|
+
entities,
|
|
295
|
+
cursor: nextCursor
|
|
296
|
+
}
|
|
297
|
+
}
|
|
298
|
+
}
|
|
299
|
+
```
|
|
300
|
+
|
|
301
|
+
Now that you have your new Incremental Entity Provider, we can connect it to the catalog.
|
|
302
|
+
|
|
303
|
+
## Adding an Incremental Entity Provider to the catalog
|
|
304
|
+
|
|
305
|
+
We'll assume you followed the <a href="#installation">Installation</a> instructions. After you create your `incrementalBuilder`, you can instantiate your Entity Provider and pass it to the `addIncrementalEntityProvider` method.
|
|
306
|
+
|
|
307
|
+
```ts
|
|
308
|
+
const incrementalBuilder = await IncrementalCatalogBuilder.create(env, builder);
|
|
309
|
+
|
|
310
|
+
// I'm assuming you're going to get your token from config
|
|
311
|
+
const token = env.config.getString('myApiClient.token');
|
|
312
|
+
|
|
313
|
+
const myEntityProvider = new MyIncrementalEntityProvider(token)
|
|
314
|
+
|
|
315
|
+
incrementalBuilder.addIncrementalEntityProvider(
|
|
316
|
+
myEntityProvider,
|
|
317
|
+
{
|
|
318
|
+
// how long should it attempt to read pages from the API
|
|
319
|
+
// keep this short. Incremental Entity Provider will attempt to
|
|
320
|
+
// read as many pages as it can in this time
|
|
321
|
+
burstLength: Duration.fromObject({ seconds: 3 }),
|
|
322
|
+
// how long should it wait between bursts?
|
|
323
|
+
burstInterval: Duration.fromObject({ seconds: 3 }),
|
|
324
|
+
// how long should it rest before re-ingesting again?
|
|
325
|
+
restLength: Duration.fromObject({ day: 1 }),
|
|
326
|
+
// optional back-off configuration - how long should it wait to retry?
|
|
327
|
+
backoff: [
|
|
328
|
+
Duration.fromObject({ seconds: 5 }),
|
|
329
|
+
Duration.fromObject({ seconds: 30 }),
|
|
330
|
+
Duration.fromObject({ minutes: 10 }),
|
|
331
|
+
Duration.fromObject({ hours: 3 })
|
|
332
|
+
]
|
|
333
|
+
}
|
|
334
|
+
)
|
|
335
|
+
```
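The type declarations shipped in this package describe `burstLength`, `burstInterval`, `restLength`, and `backoff` as luxon `DurationObjectUnits`, that is, plain objects such as `{ seconds: 3 }`. The sketch below shows the same settings in that form. `DurationUnits`, `ProviderOptions`, and `toMillis` are local stand-ins declared here so the example has no dependencies (assumptions for illustration); whether a `Duration` instance is also accepted is not stated in the source.

```typescript
// Local subset of luxon's DurationObjectUnits, declared here for illustration.
type DurationUnits = {
  days?: number;
  hours?: number;
  minutes?: number;
  seconds?: number;
};

// Mirrors the option names declared by IncrementalEntityProviderOptions.
interface ProviderOptions {
  burstLength: DurationUnits;
  burstInterval: DurationUnits;
  restLength: DurationUnits;
  backoff?: DurationUnits[];
}

const options: ProviderOptions = {
  // read pages for up to 3 seconds per burst
  burstLength: { seconds: 3 },
  // wait 3 seconds between bursts
  burstInterval: { seconds: 3 },
  // rest for a day after a complete ingestion
  restLength: { days: 1 },
  // optional retry delays after a failed burst
  backoff: [{ seconds: 5 }, { seconds: 30 }, { minutes: 10 }, { hours: 3 }],
};

// Helper for illustration only: converts a duration object to milliseconds.
function toMillis(d: DurationUnits): number {
  const seconds =
    (d.days ?? 0) * 86400 +
    (d.hours ?? 0) * 3600 +
    (d.minutes ?? 0) * 60 +
    (d.seconds ?? 0);
  return seconds * 1000;
}
```

Expressing the settings as plain objects also keeps them easy to load from configuration, since they are already JSON-compatible.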
|
|
336
|
+
|
|
337
|
+
That's it!!!
|
|
338
|
+
|
|
339
|
+
## Error handling
|
|
340
|
+
|
|
341
|
+
If `around` or `next` methods throw an error, the error will show up in logs and it'll trigger the Incremental Entity Provider to try again after a back-off period. It'll keep trying until it reaches the last back-off attempt, at which point it will cancel the current ingestion and start over. You don't need to do anything special to handle the retry logic.
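The retry behaviour can be pictured with the following sketch. It is one possible reading of the paragraph above, not the module's implementation: `runWithBackoff`, its parameters, and the choice to cancel once every back-off step has been used are assumptions made for illustration.

```typescript
type Attempt = () => Promise<void>;

// Runs `attempt`; after the i-th failure it waits backoffMs[i] before retrying.
// Returns 'ok' on success, or 'cancelled' once every back-off step has been used.
async function runWithBackoff(
  attempt: Attempt,
  backoffMs: number[],
  sleep: (ms: number) => Promise<void>,
  log: (message: string) => void = () => {},
): Promise<'ok' | 'cancelled'> {
  for (let failures = 0; ; failures++) {
    try {
      await attempt();
      return 'ok';
    } catch (error) {
      // The error is logged; the provider author does not handle the retry.
      log(`burst failed: ${String(error)}`);
      const delay = backoffMs[failures];
      if (delay === undefined) {
        // No back-off steps left: the caller would start a fresh ingestion.
        return 'cancelled';
      }
      await sleep(delay);
    }
  }
}
```

Injecting `sleep` keeps the sketch testable without real delays; a production version would wait on a timer or a scheduler instead.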
|
|
@@ -0,0 +1,153 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Provides efficient incremental ingestion of entities into the catalog.
|
|
3
|
+
*
|
|
4
|
+
* @packageDocumentation
|
|
5
|
+
*/
|
|
6
|
+
|
|
7
|
+
/// <reference types="express" />
|
|
8
|
+
|
|
9
|
+
import { BackendFeature } from '@backstage/backend-plugin-api';
|
|
10
|
+
import { CatalogBuilder } from '@backstage/plugin-catalog-backend';
|
|
11
|
+
import type { Config } from '@backstage/config';
|
|
12
|
+
import type { DeferredEntity } from '@backstage/plugin-catalog-backend';
|
|
13
|
+
import type { DurationObjectUnits } from 'luxon';
|
|
14
|
+
import type { Logger } from 'winston';
|
|
15
|
+
import type { PermissionEvaluator } from '@backstage/plugin-permission-common';
|
|
16
|
+
import type { PluginDatabaseManager } from '@backstage/backend-common';
|
|
17
|
+
import type { PluginTaskScheduler } from '@backstage/backend-tasks';
|
|
18
|
+
import { Router } from 'express';
|
|
19
|
+
import type { UrlReader } from '@backstage/backend-common';
|
|
20
|
+
|
|
21
|
+
/**
|
|
22
|
+
* Value returned by an {@link IncrementalEntityProvider} to provide a
|
|
23
|
+
* single page of entities to ingest.
|
|
24
|
+
*
|
|
25
|
+
* @public
|
|
26
|
+
*/
|
|
27
|
+
export declare type EntityIteratorResult<T> = {
|
|
28
|
+
done: false;
|
|
29
|
+
entities: DeferredEntity[];
|
|
30
|
+
cursor: T;
|
|
31
|
+
} | {
|
|
32
|
+
done: true;
|
|
33
|
+
entities?: DeferredEntity[];
|
|
34
|
+
cursor?: T;
|
|
35
|
+
};
|
|
36
|
+
|
|
37
|
+
/**
|
|
38
|
+
* Entity annotation containing the incremental entity provider.
|
|
39
|
+
*
|
|
40
|
+
* @public
|
|
41
|
+
*/
|
|
42
|
+
export declare const INCREMENTAL_ENTITY_PROVIDER_ANNOTATION = "backstage.io/incremental-provider-name";
|
|
43
|
+
|
|
44
|
+
/** @public */
|
|
45
|
+
export declare class IncrementalCatalogBuilder {
|
|
46
|
+
private env;
|
|
47
|
+
private builder;
|
|
48
|
+
private client;
|
|
49
|
+
private manager;
|
|
50
|
+
/**
|
|
51
|
+
* Creates the incremental catalog builder, which extends the regular catalog builder.
|
|
52
|
+
* @param env - PluginEnvironment
|
|
53
|
+
* @param builder - CatalogBuilder
|
|
54
|
+
* @returns IncrementalCatalogBuilder
|
|
55
|
+
*/
|
|
56
|
+
static create(env: PluginEnvironment, builder: CatalogBuilder): Promise<IncrementalCatalogBuilder>;
|
|
57
|
+
private ready;
|
|
58
|
+
private constructor();
|
|
59
|
+
build(): Promise<{
|
|
60
|
+
incrementalAdminRouter: Router;
|
|
61
|
+
}>;
|
|
62
|
+
addIncrementalEntityProvider<TCursor, TContext>(provider: IncrementalEntityProvider<TCursor, TContext>, options: IncrementalEntityProviderOptions): void;
|
|
63
|
+
}
|
|
64
|
+
|
|
65
|
+
/**
|
|
66
|
+
* Ingest entities into the catalog in bite-sized chunks.
|
|
67
|
+
*
|
|
68
|
+
* A normal `EntityProvider` allows you to introduce entities into the
|
|
69
|
+
* processing pipeline by calling an `applyMutation()` on the full set
|
|
70
|
+
* of entities. However, this is not great when the number of entities
|
|
71
|
+
* that you have to keep track of is extremely large because it
|
|
72
|
+
* entails having all of them in memory at once. An
|
|
73
|
+
* `IncrementalEntityProvider` by contrast allows you to provide
|
|
74
|
+
* batches of entities in sequence so that you never need to have more
|
|
75
|
+
* than a few hundred in memory at a time.
|
|
76
|
+
*
|
|
77
|
+
* @public
|
|
78
|
+
*/
|
|
79
|
+
export declare interface IncrementalEntityProvider<TCursor, TContext> {
|
|
80
|
+
/**
|
|
81
|
+
* This name must be unique between all of the entity providers
|
|
82
|
+
* operating in the catalog.
|
|
83
|
+
*/
|
|
84
|
+
getProviderName(): string;
|
|
85
|
+
/**
|
|
86
|
+
* Return a single page of entities from a specific point in the
|
|
87
|
+
* ingestion.
|
|
88
|
+
*
|
|
89
|
+
* @param context - anything needed in order to fetch a single page.
|
|
90
|
+
* @param cursor - a unique value identifying the page to ingest.
|
|
91
|
+
* @returns The entities to be ingested, as well as the cursor of
|
|
92
|
+
* the next page after this one.
|
|
93
|
+
*/
|
|
94
|
+
next(context: TContext, cursor?: TCursor): Promise<EntityIteratorResult<TCursor>>;
|
|
95
|
+
/**
|
|
96
|
+
* Do any setup and teardown necessary in order to provide the
|
|
97
|
+
* context for fetching pages. This should always invoke `burst` in
|
|
98
|
+
* order to fetch the individual pages.
|
|
99
|
+
*
|
|
100
|
+
* @param burst - a function which performs a series of iterations
|
|
101
|
+
*/
|
|
102
|
+
around(burst: (context: TContext) => Promise<void>): Promise<void>;
|
|
103
|
+
}
|
|
104
|
+
|
|
105
|
+
/** @public */
|
|
106
|
+
export declare interface IncrementalEntityProviderOptions {
|
|
107
|
+
/**
|
|
108
|
+
* Entities are ingested in bursts. This interval determines how
|
|
109
|
+
* much time to wait in between each burst.
|
|
110
|
+
*/
|
|
111
|
+
burstInterval: DurationObjectUnits;
|
|
112
|
+
/**
|
|
113
|
+
* Entities are ingested in bursts. This value determines how long
|
|
114
|
+
* to keep ingesting within each burst.
|
|
115
|
+
*/
|
|
116
|
+
burstLength: DurationObjectUnits;
|
|
117
|
+
/**
|
|
118
|
+
* After a successful ingestion, the incremental entity provider
|
|
119
|
+
* will rest for this period of time before starting to ingest
|
|
120
|
+
* again.
|
|
121
|
+
*/
|
|
122
|
+
restLength: DurationObjectUnits;
|
|
123
|
+
/**
|
|
124
|
+
* In the event of an error during an ingestion burst, the backoff
|
|
125
|
+
* determines how soon it will be retried. E.g.
|
|
126
|
+
* `[{ minutes: 1}, { minutes: 5}, {minutes: 30 }, { hours: 3 }]`
|
|
127
|
+
*/
|
|
128
|
+
backoff?: DurationObjectUnits[];
|
|
129
|
+
}
|
|
130
|
+
|
|
131
|
+
/**
|
|
132
|
+
* Registers the incremental entity provider with the catalog processing extension point.
|
|
133
|
+
*
|
|
134
|
+
* @alpha
|
|
135
|
+
*/
|
|
136
|
+
export declare const incrementalIngestionEntityProviderCatalogModule: (options: {
|
|
137
|
+
providers: {
|
|
138
|
+
provider: IncrementalEntityProvider<unknown, unknown>;
|
|
139
|
+
options: IncrementalEntityProviderOptions;
|
|
140
|
+
}[];
|
|
141
|
+
}) => BackendFeature;
|
|
142
|
+
|
|
143
|
+
/** @public */
|
|
144
|
+
export declare type PluginEnvironment = {
|
|
145
|
+
logger: Logger;
|
|
146
|
+
database: PluginDatabaseManager;
|
|
147
|
+
scheduler: PluginTaskScheduler;
|
|
148
|
+
config: Config;
|
|
149
|
+
reader: UrlReader;
|
|
150
|
+
permissions: PermissionEvaluator;
|
|
151
|
+
};
|
|
152
|
+
|
|
153
|
+
export { }
|
|
@@ -0,0 +1,143 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Provides efficient incremental ingestion of entities into the catalog.
|
|
3
|
+
*
|
|
4
|
+
* @packageDocumentation
|
|
5
|
+
*/
|
|
6
|
+
|
|
7
|
+
/// <reference types="express" />
|
|
8
|
+
|
|
9
|
+
import { BackendFeature } from '@backstage/backend-plugin-api';
|
|
10
|
+
import { CatalogBuilder } from '@backstage/plugin-catalog-backend';
|
|
11
|
+
import type { Config } from '@backstage/config';
|
|
12
|
+
import type { DeferredEntity } from '@backstage/plugin-catalog-backend';
|
|
13
|
+
import type { DurationObjectUnits } from 'luxon';
|
|
14
|
+
import type { Logger } from 'winston';
|
|
15
|
+
import type { PermissionEvaluator } from '@backstage/plugin-permission-common';
|
|
16
|
+
import type { PluginDatabaseManager } from '@backstage/backend-common';
|
|
17
|
+
import type { PluginTaskScheduler } from '@backstage/backend-tasks';
|
|
18
|
+
import { Router } from 'express';
|
|
19
|
+
import type { UrlReader } from '@backstage/backend-common';
|
|
20
|
+
|
|
21
|
+
/**
|
|
22
|
+
* Value returned by an {@link IncrementalEntityProvider} to provide a
|
|
23
|
+
* single page of entities to ingest.
|
|
24
|
+
*
|
|
25
|
+
* @public
|
|
26
|
+
*/
|
|
27
|
+
export declare type EntityIteratorResult<T> = {
|
|
28
|
+
done: false;
|
|
29
|
+
entities: DeferredEntity[];
|
|
30
|
+
cursor: T;
|
|
31
|
+
} | {
|
|
32
|
+
done: true;
|
|
33
|
+
entities?: DeferredEntity[];
|
|
34
|
+
cursor?: T;
|
|
35
|
+
};
|
|
36
|
+
|
|
37
|
+
/**
|
|
38
|
+
* Entity annotation containing the incremental entity provider.
|
|
39
|
+
*
|
|
40
|
+
* @public
|
|
41
|
+
*/
|
|
42
|
+
export declare const INCREMENTAL_ENTITY_PROVIDER_ANNOTATION = "backstage.io/incremental-provider-name";
|
|
43
|
+
|
|
44
|
+
/** @public */
|
|
45
|
+
export declare class IncrementalCatalogBuilder {
|
|
46
|
+
private env;
|
|
47
|
+
private builder;
|
|
48
|
+
private client;
|
|
49
|
+
private manager;
|
|
50
|
+
/**
|
|
51
|
+
* Creates the incremental catalog builder, which extends the regular catalog builder.
|
|
52
|
+
* @param env - PluginEnvironment
|
|
53
|
+
* @param builder - CatalogBuilder
|
|
54
|
+
* @returns IncrementalCatalogBuilder
|
|
55
|
+
*/
|
|
56
|
+
static create(env: PluginEnvironment, builder: CatalogBuilder): Promise<IncrementalCatalogBuilder>;
|
|
57
|
+
private ready;
|
|
58
|
+
private constructor();
|
|
59
|
+
build(): Promise<{
|
|
60
|
+
incrementalAdminRouter: Router;
|
|
61
|
+
}>;
|
|
62
|
+
addIncrementalEntityProvider<TCursor, TContext>(provider: IncrementalEntityProvider<TCursor, TContext>, options: IncrementalEntityProviderOptions): void;
|
|
63
|
+
}
|
|
64
|
+
|
|
65
|
+
/**
|
|
66
|
+
* Ingest entities into the catalog in bite-sized chunks.
|
|
67
|
+
*
|
|
68
|
+
* A normal `EntityProvider` allows you to introduce entities into the
|
|
69
|
+
* processing pipeline by calling an `applyMutation()` on the full set
|
|
70
|
+
* of entities. However, this is not great when the number of entities
|
|
71
|
+
* that you have to keep track of is extremely large because it
|
|
72
|
+
* entails having all of them in memory at once. An
|
|
73
|
+
* `IncrementalEntityProvider` by contrast allows you to provide
|
|
74
|
+
* batches of entities in sequence so that you never need to have more
|
|
75
|
+
* than a few hundred in memory at a time.
|
|
76
|
+
*
|
|
77
|
+
* @public
|
|
78
|
+
*/
|
|
79
|
+
export declare interface IncrementalEntityProvider<TCursor, TContext> {
|
|
80
|
+
/**
|
|
81
|
+
* This name must be unique between all of the entity providers
|
|
82
|
+
* operating in the catalog.
|
|
83
|
+
*/
|
|
84
|
+
getProviderName(): string;
|
|
85
|
+
/**
|
|
86
|
+
* Return a single page of entities from a specific point in the
|
|
87
|
+
* ingestion.
|
|
88
|
+
*
|
|
89
|
+
* @param context - anything needed in order to fetch a single page.
|
|
90
|
+
* @param cursor - a unique value identifying the page to ingest.
|
|
91
|
+
* @returns The entities to be ingested, as well as the cursor of
|
|
92
|
+
* the next page after this one.
|
|
93
|
+
*/
|
|
94
|
+
next(context: TContext, cursor?: TCursor): Promise<EntityIteratorResult<TCursor>>;
|
|
95
|
+
/**
|
|
96
|
+
* Do any setup and teardown necessary in order to provide the
|
|
97
|
+
* context for fetching pages. This should always invoke `burst` in
|
|
98
|
+
* order to fetch the individual pages.
|
|
99
|
+
*
|
|
100
|
+
* @param burst - a function which performs a series of iterations
|
|
101
|
+
*/
|
|
102
|
+
around(burst: (context: TContext) => Promise<void>): Promise<void>;
|
|
103
|
+
}
|
|
104
|
+
|
|
105
|
+
/** @public */
|
|
106
|
+
export declare interface IncrementalEntityProviderOptions {
|
|
107
|
+
/**
|
|
108
|
+
* Entities are ingested in bursts. This interval determines how
|
|
109
|
+
* much time to wait in between each burst.
|
|
110
|
+
*/
|
|
111
|
+
burstInterval: DurationObjectUnits;
|
|
112
|
+
/**
|
|
113
|
+
* Entities are ingested in bursts. This value determines how long
|
|
114
|
+
* to keep ingesting within each burst.
|
|
115
|
+
*/
|
|
116
|
+
burstLength: DurationObjectUnits;
|
|
117
|
+
/**
|
|
118
|
+
* After a successful ingestion, the incremental entity provider
|
|
119
|
+
* will rest for this period of time before starting to ingest
|
|
120
|
+
* again.
|
|
121
|
+
*/
|
|
122
|
+
restLength: DurationObjectUnits;
|
|
123
|
+
/**
|
|
124
|
+
* In the event of an error during an ingestion burst, the backoff
|
|
125
|
+
* determines how soon it will be retried. E.g.
|
|
126
|
+
* `[{ minutes: 1}, { minutes: 5}, {minutes: 30 }, { hours: 3 }]`
|
|
127
|
+
*/
|
|
128
|
+
backoff?: DurationObjectUnits[];
|
|
129
|
+
}
|
|
130
|
+
|
|
131
|
+
/* Excluded from this release type: incrementalIngestionEntityProviderCatalogModule */
|
|
132
|
+
|
|
133
|
+
/** @public */
|
|
134
|
+
export declare type PluginEnvironment = {
|
|
135
|
+
logger: Logger;
|
|
136
|
+
database: PluginDatabaseManager;
|
|
137
|
+
scheduler: PluginTaskScheduler;
|
|
138
|
+
config: Config;
|
|
139
|
+
reader: UrlReader;
|
|
140
|
+
permissions: PermissionEvaluator;
|
|
141
|
+
};
|
|
142
|
+
|
|
143
|
+
export { }
|