stac-fastapi-elasticsearch 6.0.0__tar.gz → 6.2.0__tar.gz
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/PKG-INFO +91 -14
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/README.md +90 -13
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/setup.py +2 -2
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi/elasticsearch/app.py +11 -1
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi/elasticsearch/config.py +4 -0
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi/elasticsearch/database_logic.py +103 -46
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi/elasticsearch/version.py +1 -1
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi_elasticsearch.egg-info/PKG-INFO +91 -14
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi_elasticsearch.egg-info/requires.txt +2 -2
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/setup.cfg +0 -0
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi/elasticsearch/__init__.py +0 -0
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi_elasticsearch.egg-info/SOURCES.txt +0 -0
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi_elasticsearch.egg-info/dependency_links.txt +0 -0
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi_elasticsearch.egg-info/entry_points.txt +0 -0
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi_elasticsearch.egg-info/not-zip-safe +0 -0
- {stac_fastapi_elasticsearch-6.0.0 → stac_fastapi_elasticsearch-6.2.0}/stac_fastapi_elasticsearch.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: stac_fastapi_elasticsearch
-Version: 6.
+Version: 6.2.0
 Summary: An implementation of STAC API based on the FastAPI framework with both Elasticsearch and Opensearch.
 Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
 License: MIT
@@ -36,7 +36,7 @@ Provides-Extra: server
 [](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/network/members)
 [](https://pypi.org/project/stac-fastapi-elasticsearch/)
 [](https://github.com/radiantearth/stac-spec/tree/v1.1.0)
-[](https://github.com/stac-utils/stac-fastapi)
 
 ## Sponsors & Supporters
 
@@ -106,6 +106,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Auth](#auth)
 - [Aggregation](#aggregation)
 - [Rate Limiting](#rate-limiting)
+- [Datetime-Based Index Management](#datetime-based-index-management)
 
 ## Documentation & Resources
 
@@ -226,28 +227,105 @@ You can customize additional settings in your `.env` file:
 |------------------------------|--------------------------------------------------------------------------------------|--------------------------|---------------------------------------------------------------------------------------------|
 | `ES_HOST` | Hostname for external Elasticsearch/OpenSearch. | `localhost` | Optional |
 | `ES_PORT` | Port for Elasticsearch/OpenSearch. | `9200` (ES) / `9202` (OS)| Optional |
-| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `
-| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `
+| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `true` | Optional |
+| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `true` | Optional |
+| `ES_API_KEY` | API Key for external Elasticsearch/OpenSearch. | N/A | Optional |
+| `ES_TIMEOUT` | Client timeout for Elasticsearch/OpenSearch. | DB client default | Optional |
 | `STAC_FASTAPI_TITLE` | Title of the API in the documentation. | `stac-fastapi-<backend>` | Optional |
 | `STAC_FASTAPI_DESCRIPTION` | Description of the API in the documentation. | N/A | Optional |
 | `STAC_FASTAPI_VERSION` | API version. | `2.1` | Optional |
-| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID
+| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID | `stac-fastapi` | Optional |
 | `APP_HOST` | Server bind address. | `0.0.0.0` | Optional |
-| `APP_PORT` | Server port. | `
+| `APP_PORT` | Server port. | `8000` | Optional |
 | `ENVIRONMENT` | Runtime environment. | `local` | Optional |
 | `WEB_CONCURRENCY` | Number of worker processes. | `10` | Optional |
 | `RELOAD` | Enable auto-reload for development. | `true` | Optional |
 | `STAC_FASTAPI_RATE_LIMIT` | API rate limit per client. | `200/minute` | Optional |
-| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional
-| `ELASTICSEARCH_VERSION`
-| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional
-| `ENABLE_DIRECT_RESPONSE`
-| `RAISE_ON_BULK_ERROR`
-| `DATABASE_REFRESH`
+| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional |
+| `ELASTICSEARCH_VERSION` | Version of Elasticsearch to use. | `8.11.0` | Optional |
+| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional |
+| `ENABLE_DIRECT_RESPONSE` | Enable direct response for maximum performance (disables all FastAPI dependencies, including authentication, custom status codes, and validation) | `false` | Optional |
+| `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
+| `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
 | `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
 
 > [!NOTE]
-> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, and `
+> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
+
+## Datetime-Based Index Management
+
+### Overview
+
+SFEOS supports two indexing strategies for managing STAC items:
+
+1. **Simple Indexing** (default) - One index per collection
+2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
+
+The datetime-based indexing strategy is particularly useful for large temporal datasets. When a user provides a datetime parameter in a query, the system knows exactly which index to search, providing **multiple times faster searches** and significantly **reducing database load**.
+
+### When to Use
+
+**Recommended for:**
+- Systems with large collections containing millions of items
+- Systems requiring high-performance temporal searching
+
+**Pros:**
+- Multiple times faster queries with datetime filter
+- Reduced database load - only relevant indexes are searched
+
+**Cons:**
+- Slightly longer item indexing time (automatic index management)
+- Greater management complexity
+
+### Configuration
+
+#### Enabling Datetime-Based Indexing
+
+Enable datetime-based indexing by setting the following environment variable:
+
+```bash
+ENABLE_DATETIME_INDEX_FILTERING=true
+```
+
+### Related Configuration Variables
+
+| Variable | Description | Default | Example |
+|----------|-------------|---------|---------|
+| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
+| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
+| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
+
+## How Datetime-Based Indexing Works
+
+### Index and Alias Naming Convention
+
+The system uses a precise naming convention:
+
+**Physical indexes:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
+```
+
+**Aliases:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
+```
+
+**Example:**
+
+*Physical indexes:*
+- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
+
+*Aliases:*
+- `items_sentinel-2-l2a` - main collection alias
+- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
+- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
+
+### Index Size Management
+
+**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add +20% to the target size to account for compression overhead and metadata.
 
 ## Interacting with the API
 
@@ -557,4 +635,3 @@ You can customize additional settings in your `.env` file:
 - Ensures fair resource allocation among all clients
 
 - **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
-
@@ -15,7 +15,7 @@
 [](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/network/members)
 [](https://pypi.org/project/stac-fastapi-elasticsearch/)
 [](https://github.com/radiantearth/stac-spec/tree/v1.1.0)
-[](https://github.com/stac-utils/stac-fastapi)
 
 ## Sponsors & Supporters
 
@@ -85,6 +85,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Auth](#auth)
 - [Aggregation](#aggregation)
 - [Rate Limiting](#rate-limiting)
+- [Datetime-Based Index Management](#datetime-based-index-management)
 
 ## Documentation & Resources
 
@@ -205,28 +206,105 @@ You can customize additional settings in your `.env` file:
 |------------------------------|--------------------------------------------------------------------------------------|--------------------------|---------------------------------------------------------------------------------------------|
 | `ES_HOST` | Hostname for external Elasticsearch/OpenSearch. | `localhost` | Optional |
 | `ES_PORT` | Port for Elasticsearch/OpenSearch. | `9200` (ES) / `9202` (OS)| Optional |
-| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `
-| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `
+| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `true` | Optional |
+| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `true` | Optional |
+| `ES_API_KEY` | API Key for external Elasticsearch/OpenSearch. | N/A | Optional |
+| `ES_TIMEOUT` | Client timeout for Elasticsearch/OpenSearch. | DB client default | Optional |
 | `STAC_FASTAPI_TITLE` | Title of the API in the documentation. | `stac-fastapi-<backend>` | Optional |
 | `STAC_FASTAPI_DESCRIPTION` | Description of the API in the documentation. | N/A | Optional |
 | `STAC_FASTAPI_VERSION` | API version. | `2.1` | Optional |
-| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID
+| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID | `stac-fastapi` | Optional |
 | `APP_HOST` | Server bind address. | `0.0.0.0` | Optional |
-| `APP_PORT` | Server port. | `
+| `APP_PORT` | Server port. | `8000` | Optional |
 | `ENVIRONMENT` | Runtime environment. | `local` | Optional |
 | `WEB_CONCURRENCY` | Number of worker processes. | `10` | Optional |
 | `RELOAD` | Enable auto-reload for development. | `true` | Optional |
 | `STAC_FASTAPI_RATE_LIMIT` | API rate limit per client. | `200/minute` | Optional |
-| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional
-| `ELASTICSEARCH_VERSION`
-| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional
-| `ENABLE_DIRECT_RESPONSE`
-| `RAISE_ON_BULK_ERROR`
-| `DATABASE_REFRESH`
+| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional |
+| `ELASTICSEARCH_VERSION` | Version of Elasticsearch to use. | `8.11.0` | Optional |
+| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional |
+| `ENABLE_DIRECT_RESPONSE` | Enable direct response for maximum performance (disables all FastAPI dependencies, including authentication, custom status codes, and validation) | `false` | Optional |
+| `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
+| `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
 | `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
 
 > [!NOTE]
-> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, and `
+> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
+
+## Datetime-Based Index Management
+
+### Overview
+
+SFEOS supports two indexing strategies for managing STAC items:
+
+1. **Simple Indexing** (default) - One index per collection
+2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
+
+The datetime-based indexing strategy is particularly useful for large temporal datasets. When a user provides a datetime parameter in a query, the system knows exactly which index to search, providing **multiple times faster searches** and significantly **reducing database load**.
+
+### When to Use
+
+**Recommended for:**
+- Systems with large collections containing millions of items
+- Systems requiring high-performance temporal searching
+
+**Pros:**
+- Multiple times faster queries with datetime filter
+- Reduced database load - only relevant indexes are searched
+
+**Cons:**
+- Slightly longer item indexing time (automatic index management)
+- Greater management complexity
+
+### Configuration
+
+#### Enabling Datetime-Based Indexing
+
+Enable datetime-based indexing by setting the following environment variable:
+
+```bash
+ENABLE_DATETIME_INDEX_FILTERING=true
+```
+
+### Related Configuration Variables
+
+| Variable | Description | Default | Example |
+|----------|-------------|---------|---------|
+| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
+| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
+| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
+
+## How Datetime-Based Indexing Works
+
+### Index and Alias Naming Convention
+
+The system uses a precise naming convention:
+
+**Physical indexes:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
+```
+
+**Aliases:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
+```
+
+**Example:**
+
+*Physical indexes:*
+- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
+
+*Aliases:*
+- `items_sentinel-2-l2a` - main collection alias
+- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
+- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
+
+### Index Size Management
+
+**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add +20% to the target size to account for compression overhead and metadata.
 
 ## Interacting with the API
 
@@ -536,4 +614,3 @@ You can customize additional settings in your `.env` file:
 - Ensures fair resource allocation among all clients
 
 - **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
-
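The README additions above describe temporal aliases whose names carry a start date and, once an index is closed, an end date. The actual selector lives in `sfeos_helpers` and is not part of this diff, so the following is only a minimal sketch of how such names could be matched against a query interval. It assumes the `YYYY-MM-DD` date form shown in the README example and the default `items_` prefix; the function names `parse_temporal_alias` and `aliases_overlapping` are hypothetical.

```python
from datetime import date
from typing import List, Optional, Tuple

# Default prefix taken from the README table; configurable in the real project.
ITEMS_INDEX_PREFIX = "items_"


def main_alias(collection_id: str) -> str:
    """Return the main collection alias, e.g. 'items_sentinel-2-l2a'."""
    return f"{ITEMS_INDEX_PREFIX}{collection_id}"


def parse_temporal_alias(alias: str, collection_id: str) -> Tuple[date, Optional[date]]:
    """Split a temporal alias into (start, end); end is None for an open alias.

    Hypothetical helper: assumes dates are written as YYYY-MM-DD.
    """
    suffix = alias[len(main_alias(collection_id)) + 1:]  # drop '<main alias>_'
    parts = suffix.split("_")
    start = date.fromisoformat(parts[0])
    end = date.fromisoformat(parts[1]) if len(parts) > 1 else None
    return start, end


def aliases_overlapping(
    aliases: List[str], collection_id: str, query_start: date, query_end: date
) -> List[str]:
    """Keep only the aliases whose date range overlaps the query interval."""
    selected = []
    for alias in aliases:
        start, end = parse_temporal_alias(alias, collection_id)
        if start <= query_end and (end is None or end >= query_start):
            selected.append(alias)
    return selected
```

With the two temporal aliases `items_sentinel-2-l2a_2024-01-01_2024-03-15` and `items_sentinel-2-l2a_2024-03-15`, a query limited to February 2024 would touch only the first one, which is the effect the README attributes to datetime-based indexing.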
@@ -6,8 +6,8 @@ with open("README.md") as f:
     desc = f.read()
 
 install_requires = [
-    "stac-fastapi-core==6.
-    "sfeos-helpers==6.
+    "stac-fastapi-core==6.2.0",
+    "sfeos-helpers==6.2.0",
     "elasticsearch[async]~=8.18.0",
     "uvicorn~=0.23.0",
     "starlette>=0.35.0,<0.36.0",
@@ -31,6 +31,7 @@ from stac_fastapi.elasticsearch.database_logic import (
 )
 from stac_fastapi.extensions.core import (
     AggregationExtension,
+    CollectionSearchExtension,
     FilterExtension,
     FreeTextExtension,
     SortExtension,
@@ -60,6 +61,14 @@ filter_extension.conformance_classes.append(
     FilterConformanceClasses.ADVANCED_COMPARISON_OPERATORS
 )
 
+# Adding collection search extension for compatibility with stac-auth-proxy
+# (https://github.com/developmentseed/stac-auth-proxy)
+# The extension is not fully implemented yet but is required for collection filtering support
+collection_search_extension = CollectionSearchExtension()
+collection_search_extension.conformance_classes.append(
+    "https://api.stacspec.org/v1.0.0-rc.1/collection-search#filter"
+)
+
 aggregation_extension = AggregationExtension(
     client=EsAsyncBaseAggregationClient(
         database=database_logic, session=session, settings=settings
@@ -75,6 +84,7 @@ search_extensions = [
     TokenPaginationExtension(),
     filter_extension,
     FreeTextExtension(),
+    collection_search_extension,
 ]
 
 if TRANSACTIONS_EXTENSIONS:
@@ -107,7 +117,7 @@ post_request_model = create_post_request_model(search_extensions)
 app_config = {
     "title": os.getenv("STAC_FASTAPI_TITLE", "stac-fastapi-elasticsearch"),
     "description": os.getenv("STAC_FASTAPI_DESCRIPTION", "stac-fastapi-elasticsearch"),
-    "api_version": os.getenv("STAC_FASTAPI_VERSION", "6.
+    "api_version": os.getenv("STAC_FASTAPI_VERSION", "6.2.0"),
     "settings": settings,
     "extensions": extensions,
     "client": CoreClient(
@@ -56,6 +56,10 @@ def _es_config() -> Dict[str, Any]:
     if (u := os.getenv("ES_USER")) and (p := os.getenv("ES_PASS")):
         config["http_auth"] = (u, p)
 
+    # Include timeout setting if set
+    if request_timeout := os.getenv("ES_TIMEOUT"):
+        config["request_timeout"] = request_timeout
+
     # Explicitly exclude SSL settings when not using SSL
     if not use_ssl:
         return config
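In the `config.py` hunk above, `os.getenv("ES_TIMEOUT")` yields a string, and that string is stored as-is under `request_timeout`. Whether the client accepts the raw text is not shown in this diff. The sketch below is one possible variant that converts the value to a number first, on the assumption that the timeout is expressed in seconds; `timeout_from_env` and `es_config` are hypothetical names, not functions from the package.

```python
import os
from typing import Any, Dict, Mapping, Optional


def timeout_from_env(env: Mapping[str, str] = os.environ) -> Optional[float]:
    """Read ES_TIMEOUT and return it as a float, or None when unset or empty.

    Hypothetical helper; assumes the value is a number of seconds.
    """
    raw = env.get("ES_TIMEOUT")
    if not raw:
        return None
    return float(raw)  # raises ValueError on non-numeric input


def es_config(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build a minimal client config that only carries the timeout setting."""
    config: Dict[str, Any] = {}
    timeout = timeout_from_env(env)
    if timeout is not None:
        config["request_timeout"] = timeout
    return config
```

Leaving `ES_TIMEOUT` unset keeps `request_timeout` out of the config entirely, which matches the README's "DB client default" wording.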
@@ -4,7 +4,7 @@ import asyncio
 import logging
 from base64 import urlsafe_b64decode, urlsafe_b64encode
 from copy import deepcopy
-from typing import Any, Dict, Iterable, List, Optional, Tuple, Type
+from typing import Any, Dict, Iterable, List, Optional, Tuple, Type
 
 import attr
 import elasticsearch.helpers as helpers
@@ -27,7 +27,7 @@ from stac_fastapi.extensions.core.transaction.request import (
     PartialItem,
     PatchOperation,
 )
-from stac_fastapi.sfeos_helpers import filter
+from stac_fastapi.sfeos_helpers import filter as filter_module
 from stac_fastapi.sfeos_helpers.database import (
     apply_free_text_filter_shared,
     apply_intersects_filter_shared,
@@ -36,13 +36,16 @@ from stac_fastapi.sfeos_helpers.database import (
     get_queryables_mapping_shared,
     index_alias_by_collection_id,
     index_by_collection_id,
-    indices,
     mk_actions,
     mk_item_id,
     populate_sort_shared,
     return_date,
     validate_refresh,
 )
+from stac_fastapi.sfeos_helpers.database.query import (
+    ES_MAX_URL_LENGTH,
+    add_collections_to_body,
+)
 from stac_fastapi.sfeos_helpers.database.utils import (
     merge_to_operations,
     operations_to_script,
@@ -55,9 +58,14 @@ from stac_fastapi.sfeos_helpers.mappings import (
     ITEMS_INDEX_PREFIX,
     Geometry,
 )
+from stac_fastapi.sfeos_helpers.search_engine import (
+    BaseIndexInserter,
+    BaseIndexSelector,
+    IndexInsertionFactory,
+    IndexSelectorFactory,
+)
 from stac_fastapi.types.errors import ConflictError, NotFoundError
 from stac_fastapi.types.links import resolve_links
-from stac_fastapi.types.rfc3339 import DateTimeType
 from stac_fastapi.types.stac import Collection, Item
 
 logger = logging.getLogger(__name__)
@@ -135,6 +143,8 @@ class DatabaseLogic(BaseDatabaseLogic):
     sync_settings: SyncElasticsearchSettings = attr.ib(
         factory=SyncElasticsearchSettings
     )
+    async_index_selector: BaseIndexSelector = attr.ib(init=False)
+    async_index_inserter: BaseIndexInserter = attr.ib(init=False)
 
     client = attr.ib(init=False)
     sync_client = attr.ib(init=False)
@@ -143,6 +153,10 @@ class DatabaseLogic(BaseDatabaseLogic):
         """Initialize clients after the class is instantiated."""
         self.client = self.async_settings.create_client
         self.sync_client = self.sync_settings.create_client
+        self.async_index_inserter = IndexInsertionFactory.create_insertion_strategy(
+            self.client
+        )
+        self.async_index_selector = IndexSelectorFactory.create_selector(self.client)
 
     item_serializer: Type[ItemSerializer] = attr.ib(default=ItemSerializer)
     collection_serializer: Type[CollectionSerializer] = attr.ib(
@@ -212,15 +226,23 @@ class DatabaseLogic(BaseDatabaseLogic):
             with the index for the Collection as the target index and the combined `mk_item_id` as the document id.
         """
         try:
-
+            response = await self.client.search(
                 index=index_alias_by_collection_id(collection_id),
-
+                body={
+                    "query": {"term": {"_id": mk_item_id(item_id, collection_id)}},
+                    "size": 1,
+                },
             )
+            if response["hits"]["total"]["value"] == 0:
+                raise NotFoundError(
+                    f"Item {item_id} does not exist inside Collection {collection_id}"
+                )
+
+            return response["hits"]["hits"][0]["_source"]
         except ESNotFoundError:
             raise NotFoundError(
                 f"Item {item_id} does not exist inside Collection {collection_id}"
             )
-        return item["_source"]
 
     async def get_queryables_mapping(self, collection_id: str = "*") -> dict:
         """Retrieve mapping of Queryables for search.
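The hunk above replaces a direct document fetch with a `term` query on `_id` limited to one hit, presumably because a collection alias may now point at several physical indexes and a get-by-id needs a single concrete index. The sketch below isolates the two pure pieces of that change, building the request body and unpacking the response, so they can be exercised without a cluster. The names `item_lookup_body`, `first_source`, and `ItemNotFound` are hypothetical stand-ins, and the response shape is the one the diff itself reads (`hits.total.value`, `hits.hits[0]._source`).

```python
from typing import Any, Dict


class ItemNotFound(Exception):
    """Hypothetical stand-in for the project's NotFoundError."""


def item_lookup_body(doc_id: str) -> Dict[str, Any]:
    """Build a search body that matches exactly one document id."""
    return {"query": {"term": {"_id": doc_id}}, "size": 1}


def first_source(response: Dict[str, Any], item_id: str, collection_id: str) -> Dict[str, Any]:
    """Return the _source of the only hit, or raise when nothing matched."""
    if response["hits"]["total"]["value"] == 0:
        raise ItemNotFound(
            f"Item {item_id} does not exist inside Collection {collection_id}"
        )
    return response["hits"]["hits"][0]["_source"]
```

Keeping the body builder separate from the client call makes the not-found path easy to unit test with a plain dictionary.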
@@ -256,31 +278,21 @@ class DatabaseLogic(BaseDatabaseLogic):
 
     @staticmethod
     def apply_datetime_filter(
-        search: Search,
-    ) -> Search:
+        search: Search, datetime: Optional[str]
+    ) -> Tuple[Search, Dict[str, Optional[str]]]:
         """Apply a filter to search on datetime, start_datetime, and end_datetime fields.
 
         Args:
             search: The search object to filter.
-
-                - A single datetime string (e.g., "2023-01-01T12:00:00")
-                - A datetime range string (e.g., "2023-01-01/2023-12-31")
-                - A datetime object
-                - A tuple of (start_datetime, end_datetime)
+            datetime: Optional[str]
 
         Returns:
             The filtered search object.
         """
-
-            return search
+        datetime_search = return_date(datetime)
 
-
-
-            datetime_search = return_date(interval)
-        except (ValueError, TypeError) as e:
-            # Handle invalid interval formats if return_date fails
-            logger.error(f"Invalid interval format: {interval}, error: {e}")
-            return search
+        if not datetime_search:
+            return search, datetime_search
 
         if "eq" in datetime_search:
             # For exact matches, include:
@@ -347,7 +359,10 @@ class DatabaseLogic(BaseDatabaseLogic):
                 ),
             ]
 
-        return
+        return (
+            search.query(Q("bool", should=should, minimum_should_match=1)),
+            datetime_search,
+        )
 
     @staticmethod
     def apply_bbox_filter(search: Search, bbox: List):
@@ -462,7 +477,7 @@ class DatabaseLogic(BaseDatabaseLogic):
             otherwise the original Search object.
         """
         if _filter is not None:
-            es_query =
+            es_query = filter_module.to_es(await self.get_queryables_mapping(), _filter)
             search = search.query(es_query)
 
         return search
@@ -489,6 +504,7 @@ class DatabaseLogic(BaseDatabaseLogic):
         token: Optional[str],
         sort: Optional[Dict[str, Dict[str, str]]],
         collection_ids: Optional[List[str]],
+        datetime_search: Dict[str, Optional[str]],
         ignore_unavailable: bool = True,
     ) -> Tuple[Iterable[Dict[str, Any]], Optional[int], Optional[str]]:
         """Execute a search query with limit and other optional parameters.
@@ -499,6 +515,7 @@ class DatabaseLogic(BaseDatabaseLogic):
             token (Optional[str]): The token used to return the next set of results.
             sort (Optional[Dict[str, Dict[str, str]]]): Specifies how the results should be sorted.
             collection_ids (Optional[List[str]]): The collection ids to search.
+            datetime_search (Dict[str, Optional[str]]): Datetime range used for index selection.
             ignore_unavailable (bool, optional): Whether to ignore unavailable collections. Defaults to True.
 
         Returns:
@@ -519,7 +536,12 @@ class DatabaseLogic(BaseDatabaseLogic):
 
         query = search.query.to_dict() if search.query else None
 
-        index_param =
+        index_param = await self.async_index_selector.select_indexes(
+            collection_ids, datetime_search
+        )
+        if len(index_param) > ES_MAX_URL_LENGTH - 300:
+            index_param = ITEM_INDICES
+            query = add_collections_to_body(collection_ids, query)
 
         max_result_window = MAX_LIMIT
 
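The hunk above guards against an index list that is too long to fit in the request URL: when the selected index string exceeds `ES_MAX_URL_LENGTH - 300`, the search targets all item indexes and the collection restriction moves into the query body. The real `ES_MAX_URL_LENGTH`, `ITEM_INDICES`, and `add_collections_to_body` come from `sfeos_helpers` and are not shown in this diff, so the values and the body rewrite below are assumptions chosen only to illustrate the pattern.

```python
from typing import Any, Dict, List, Optional, Tuple

# Assumed values for illustration; the real constants live in sfeos_helpers.
ES_MAX_URL_LENGTH = 4096
ITEM_INDICES = "items_*"


def add_collections_filter(
    collection_ids: List[str], query: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Hypothetical body rewrite: restrict results to the given collections."""
    terms = {"terms": {"collection": collection_ids}}
    if query is None:
        return {"bool": {"filter": [terms]}}
    return {"bool": {"must": [query], "filter": [terms]}}


def choose_target(
    index_param: str,
    collection_ids: List[str],
    query: Optional[Dict[str, Any]],
    reserve: int = 300,
) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Fall back to a wildcard index plus a body filter when the URL would be too long."""
    if len(index_param) > ES_MAX_URL_LENGTH - reserve:
        return ITEM_INDICES, add_collections_filter(collection_ids, query)
    return index_param, query
```

The `reserve` margin mirrors the `- 300` in the diff, which appears to leave room for the rest of the URL beyond the index list.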
@@ -583,6 +605,7 @@ class DatabaseLogic(BaseDatabaseLogic):
         geometry_geohash_grid_precision: int,
         geometry_geotile_grid_precision: int,
         datetime_frequency_interval: str,
+        datetime_search,
         ignore_unavailable: Optional[bool] = True,
     ):
         """Return aggregations of STAC Items."""
@@ -618,7 +641,10 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
618
641
|
if k in aggregations
|
|
619
642
|
}
|
|
620
643
|
|
|
621
|
-
index_param =
|
|
644
|
+
index_param = await self.async_index_selector.select_indexes(
|
|
645
|
+
collection_ids, datetime_search
|
|
646
|
+
)
|
|
647
|
+
|
|
622
648
|
search_task = asyncio.create_task(
|
|
623
649
|
self.client.search(
|
|
624
650
|
index=index_param,
|
|
@@ -660,14 +686,21 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
660
686
|
|
|
661
687
|
"""
|
|
662
688
|
await self.check_collection_exists(collection_id=item["collection"])
|
|
689
|
+
alias = index_alias_by_collection_id(item["collection"])
|
|
690
|
+
doc_id = mk_item_id(item["id"], item["collection"])
|
|
663
691
|
|
|
664
|
-
if not exist_ok
|
|
665
|
-
|
|
666
|
-
|
|
667
|
-
|
|
668
|
-
|
|
669
|
-
|
|
670
|
-
|
|
692
|
+
if not exist_ok:
|
|
693
|
+
alias_exists = await self.client.indices.exists_alias(name=alias)
|
|
694
|
+
|
|
695
|
+
if alias_exists:
|
|
696
|
+
alias_info = await self.client.indices.get_alias(name=alias)
|
|
697
|
+
indices = list(alias_info.keys())
|
|
698
|
+
|
|
699
|
+
for index in indices:
|
|
700
|
+
if await self.client.exists(index=index, id=doc_id):
|
|
701
|
+
raise ConflictError(
|
|
702
|
+
f"Item {item['id']} in collection {item['collection']} already exists"
|
|
703
|
+
)
|
|
671
704
|
|
|
672
705
|
return self.item_serializer.stac_to_db(item, base_url)
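The new existence check in this hunk resolves the collection alias to its backing indexes and probes each one for the document ID, presumably because an alias can now point at several time-partitioned indexes. The sketch below reproduces that control flow against a small in-memory fake. `FakeClient` and `FakeIndices` are hypothetical test doubles, not part of the package or of the Elasticsearch client; only the three call names (`indices.exists_alias`, `indices.get_alias`, `exists`) are taken from the diff.

```python
import asyncio
from typing import Dict, List, Set


class FakeIndices:
    """Hypothetical stand-in for client.indices, backed by a dict of alias -> index names."""

    def __init__(self, aliases: Dict[str, List[str]]) -> None:
        self._aliases = aliases

    async def exists_alias(self, name: str) -> bool:
        return name in self._aliases

    async def get_alias(self, name: str) -> Dict[str, dict]:
        # Mirrors the response shape used in the diff: keys are concrete index names.
        return {index: {"aliases": {name: {}}} for index in self._aliases[name]}


class FakeClient:
    """Hypothetical stand-in for the async client, backed by a dict of index -> document IDs."""

    def __init__(self, aliases: Dict[str, List[str]], docs: Dict[str, Set[str]]) -> None:
        self.indices = FakeIndices(aliases)
        self._docs = docs

    async def exists(self, index: str, id: str) -> bool:
        return id in self._docs.get(index, set())


async def item_exists(client: FakeClient, alias: str, doc_id: str) -> bool:
    """Return True if doc_id is present in any index behind the alias."""
    if not await client.indices.exists_alias(name=alias):
        return False
    alias_info = await client.indices.get_alias(name=alias)
    for index in alias_info:
        if await client.exists(index=index, id=doc_id):
            return True
    return False
```

In the diff, a positive result raises `ConflictError` when `exist_ok` is false; the sketch returns a boolean so the lookup can be exercised on its own.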
|
|
673
706
|
|
|
@@ -798,7 +831,6 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
798
831
|
# Extract item and collection IDs
|
|
799
832
|
item_id = item["id"]
|
|
800
833
|
collection_id = item["collection"]
|
|
801
|
-
|
|
802
834
|
# Ensure kwargs is a dictionary
|
|
803
835
|
kwargs = kwargs or {}
|
|
804
836
|
|
|
@@ -816,9 +848,12 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
816
848
|
item=item, base_url=base_url, exist_ok=exist_ok
|
|
817
849
|
)
|
|
818
850
|
|
|
851
|
+
target_index = await self.async_index_inserter.get_target_index(
|
|
852
|
+
collection_id, item
|
|
853
|
+
)
|
|
819
854
|
# Index the item in the database
|
|
820
855
|
await self.client.index(
|
|
821
|
-
index=
|
|
856
|
+
index=target_index,
|
|
822
857
|
id=mk_item_id(item_id, collection_id),
|
|
823
858
|
document=item,
|
|
824
859
|
refresh=refresh,
|
|
@@ -897,13 +932,28 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
897
932
|
script = operations_to_script(script_operations)
|
|
898
933
|
|
|
899
934
|
try:
|
|
900
|
-
await self.client.
|
|
935
|
+
search_response = await self.client.search(
|
|
901
936
|
index=index_alias_by_collection_id(collection_id),
|
|
937
|
+
body={
|
|
938
|
+
"query": {"term": {"_id": mk_item_id(item_id, collection_id)}},
|
|
939
|
+
"size": 1,
|
|
940
|
+
},
|
|
941
|
+
)
|
|
942
|
+
if search_response["hits"]["total"]["value"] == 0:
|
|
943
|
+
raise NotFoundError(
|
|
944
|
+
f"Item {item_id} does not exist inside Collection {collection_id}"
|
|
945
|
+
)
|
|
946
|
+
document_index = search_response["hits"]["hits"][0]["_index"]
|
|
947
|
+
await self.client.update(
|
|
948
|
+
index=document_index,
|
|
902
949
|
id=mk_item_id(item_id, collection_id),
|
|
903
950
|
script=script,
|
|
904
951
|
refresh=True,
|
|
905
952
|
)
|
|
906
|
-
|
|
953
|
+
except ESNotFoundError:
|
|
954
|
+
raise NotFoundError(
|
|
955
|
+
f"Item {item_id} does not exist inside Collection {collection_id}"
|
|
956
|
+
)
|
|
907
957
|
except BadRequestError as exc:
|
|
908
958
|
raise HTTPException(
|
|
909
959
|
status_code=400, detail=exc.info["error"]["caused_by"]
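The update path in this hunk first searches the collection alias for the document ID, then issues the update against the concrete index reported in the hit. A minimal sketch of the response handling is shown below. The helper name `concrete_index_of` is hypothetical; the response shape (`hits.total.value` and `hits.hits[0]._index`) is the one the diff reads.

```python
from typing import Any, Dict, Optional


def concrete_index_of(search_response: Dict[str, Any]) -> Optional[str]:
    """Return the physical index of the first hit, or None when nothing matched."""
    hits = search_response.get("hits", {})
    if hits.get("total", {}).get("value", 0) == 0:
        return None
    return hits["hits"][0]["_index"]
```

A `None` result corresponds to the branch in the diff that raises `NotFoundError`.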
|
|
@@ -914,7 +964,9 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
914
964
|
if new_collection_id:
|
|
915
965
|
await self.client.reindex(
|
|
916
966
|
body={
|
|
917
|
-
"dest": {
|
|
967
|
+
"dest": {
|
|
968
|
+
"index": f"{ITEMS_INDEX_PREFIX}{new_collection_id}"
|
|
969
|
+
}, # # noqa
|
|
918
970
|
"source": {
|
|
919
971
|
"index": f"{ITEMS_INDEX_PREFIX}{collection_id}",
|
|
920
972
|
"query": {"term": {"id": {"value": item_id}}},
|
|
@@ -922,8 +974,8 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
922
974
|
"script": {
|
|
923
975
|
"lang": "painless",
|
|
924
976
|
"source": (
|
|
925
|
-
f"""ctx._id = ctx._id.replace('{collection_id}', '{new_collection_id}');"""
|
|
926
|
-
f"""ctx._source.collection = '{new_collection_id}';"""
|
|
977
|
+
f"""ctx._id = ctx._id.replace('{collection_id}', '{new_collection_id}');""" # noqa
|
|
978
|
+
f"""ctx._source.collection = '{new_collection_id}';""" # noqa
|
|
927
979
|
),
|
|
928
980
|
},
|
|
929
981
|
},
|
|
@@ -983,9 +1035,9 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
983
1035
|
|
|
984
1036
|
try:
|
|
985
1037
|
# Perform the delete operation
|
|
986
|
-
await self.client.
|
|
1038
|
+
await self.client.delete_by_query(
|
|
987
1039
|
index=index_alias_by_collection_id(collection_id),
|
|
988
|
-
|
|
1040
|
+
body={"query": {"term": {"_id": mk_item_id(item_id, collection_id)}}},
|
|
989
1041
|
refresh=refresh,
|
|
990
1042
|
)
|
|
991
1043
|
except ESNotFoundError:
|
|
@@ -1085,8 +1137,10 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1085
1137
|
refresh=refresh,
|
|
1086
1138
|
)
|
|
1087
1139
|
|
|
1088
|
-
|
|
1089
|
-
|
|
1140
|
+
if self.async_index_inserter.should_create_collection_index():
|
|
1141
|
+
await self.async_index_inserter.create_simple_index(
|
|
1142
|
+
self.client, collection_id
|
|
1143
|
+
)
|
|
1090
1144
|
|
|
1091
1145
|
async def find_collection(self, collection_id: str) -> Collection:
|
|
1092
1146
|
"""Find and return a collection from the database.
|
|
@@ -1360,9 +1414,12 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1360
1414
|
|
|
1361
1415
|
# Perform the bulk insert
|
|
1362
1416
|
raise_on_error = self.async_settings.raise_on_bulk_error
|
|
1417
|
+
actions = await self.async_index_inserter.prepare_bulk_actions(
|
|
1418
|
+
collection_id, processed_items
|
|
1419
|
+
)
|
|
1363
1420
|
success, errors = await helpers.async_bulk(
|
|
1364
1421
|
self.client,
|
|
1365
|
-
|
|
1422
|
+
actions,
|
|
1366
1423
|
refresh=refresh,
|
|
1367
1424
|
raise_on_error=raise_on_error,
|
|
1368
1425
|
)
|
|
@@ -1,2 +1,2 @@
|
|
|
1
1
|
"""library version."""
|
|
2
|
-
__version__ = "6.
|
|
2
|
+
__version__ = "6.2.0"
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.1
|
|
2
2
|
Name: stac-fastapi-elasticsearch
|
|
3
|
-
Version: 6.
|
|
3
|
+
Version: 6.2.0
|
|
4
4
|
Summary: An implementation of STAC API based on the FastAPI framework with both Elasticsearch and Opensearch.
|
|
5
5
|
Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
|
|
6
6
|
License: MIT
|
|
@@ -36,7 +36,7 @@ Provides-Extra: server
|
|
|
36
36
|
[](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/network/members)
|
|
37
37
|
[](https://pypi.org/project/stac-fastapi-elasticsearch/)
|
|
38
38
|
[](https://github.com/radiantearth/stac-spec/tree/v1.1.0)
|
|
39
|
-
[](https://github.com/stac-utils/stac-fastapi)
|
|
40
40
|
|
|
41
41
|
## Sponsors & Supporters
|
|
42
42
|
|
|
@@ -106,6 +106,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
|
|
|
106
106
|
- [Auth](#auth)
|
|
107
107
|
- [Aggregation](#aggregation)
|
|
108
108
|
- [Rate Limiting](#rate-limiting)
|
|
109
|
+
- [Datetime-Based Index Management](#datetime-based-index-management)
|
|
109
110
|
|
|
110
111
|
## Documentation & Resources
|
|
111
112
|
|
|
@@ -226,28 +227,105 @@ You can customize additional settings in your `.env` file:
|
|
|
226
227
|
|------------------------------|--------------------------------------------------------------------------------------|--------------------------|---------------------------------------------------------------------------------------------|
|
|
227
228
|
| `ES_HOST` | Hostname for external Elasticsearch/OpenSearch. | `localhost` | Optional |
|
|
228
229
|
| `ES_PORT` | Port for Elasticsearch/OpenSearch. | `9200` (ES) / `9202` (OS)| Optional |
|
|
229
|
-
| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `
|
|
230
|
-
| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `
|
|
230
|
+
| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `true` | Optional |
|
|
231
|
+
| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `true` | Optional |
|
|
232
|
+
| `ES_API_KEY` | API Key for external Elasticsearch/OpenSearch. | N/A | Optional |
|
|
233
|
+
| `ES_TIMEOUT` | Client timeout for Elasticsearch/OpenSearch. | DB client default | Optional |
|
|
231
234
|
| `STAC_FASTAPI_TITLE` | Title of the API in the documentation. | `stac-fastapi-<backend>` | Optional |
|
|
232
235
|
| `STAC_FASTAPI_DESCRIPTION` | Description of the API in the documentation. | N/A | Optional |
|
|
233
236
|
| `STAC_FASTAPI_VERSION` | API version. | `2.1` | Optional |
|
|
234
|
-
| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID
|
|
237
|
+
| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID | `stac-fastapi` | Optional |
|
|
235
238
|
| `APP_HOST` | Server bind address. | `0.0.0.0` | Optional |
|
|
236
|
-
| `APP_PORT` | Server port. | `
|
|
239
|
+
| `APP_PORT` | Server port. | `8000` | Optional |
|
|
237
240
|
| `ENVIRONMENT` | Runtime environment. | `local` | Optional |
|
|
238
241
|
| `WEB_CONCURRENCY` | Number of worker processes. | `10` | Optional |
|
|
239
242
|
| `RELOAD` | Enable auto-reload for development. | `true` | Optional |
|
|
240
243
|
| `STAC_FASTAPI_RATE_LIMIT` | API rate limit per client. | `200/minute` | Optional |
|
|
241
|
-
| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional
|
|
242
|
-
| `ELASTICSEARCH_VERSION`
|
|
243
|
-
| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional
|
|
244
|
-
| `ENABLE_DIRECT_RESPONSE`
|
|
245
|
-
| `RAISE_ON_BULK_ERROR`
|
|
246
|
-
| `DATABASE_REFRESH`
|
|
244
|
+
| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional |
|
|
245
|
+
| `ELASTICSEARCH_VERSION` | Version of Elasticsearch to use. | `8.11.0` | Optional |
|
|
246
|
+
| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional |
|
|
247
|
+
| `ENABLE_DIRECT_RESPONSE` | Enable direct response for maximum performance (disables all FastAPI dependencies, including authentication, custom status codes, and validation) | `false` | Optional |
|
|
248
|
+
| `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
|
|
249
|
+
| `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
|
|
247
250
|
| `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
|
|
248
251
|
|
|
249
252
|
> [!NOTE]
|
|
250
|
-
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, and `
|
|
253
|
+
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
|
|
254
|
+
|
|
255
|
+
## Datetime-Based Index Management
|
|
256
|
+
|
|
257
|
+
### Overview
|
|
258
|
+
|
|
259
|
+
SFEOS supports two indexing strategies for managing STAC items:
|
|
260
|
+
|
|
261
|
+
1. **Simple Indexing** (default) - One index per collection
|
|
262
|
+
2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
|
|
263
|
+
|
|
264
|
+
The datetime-based indexing strategy is particularly useful for large temporal datasets. When a query includes a datetime parameter, the system can narrow the search to the indexes that cover that time range, which makes searches **several times faster** and significantly **reduces database load**.
|
|
265
|
+
|
|
266
|
+
### When to Use
|
|
267
|
+
|
|
268
|
+
**Recommended for:**
|
|
269
|
+
- Systems with large collections containing millions of items
|
|
270
|
+
- Systems requiring high-performance temporal searching
|
|
271
|
+
|
|
272
|
+
**Pros:**
|
|
273
|
+
- Queries with a datetime filter run several times faster
|
|
274
|
+
- Reduced database load - only relevant indexes are searched
|
|
275
|
+
|
|
276
|
+
**Cons:**
|
|
277
|
+
- Slightly longer item indexing time (automatic index management)
|
|
278
|
+
- Greater management complexity
|
|
279
|
+
|
|
280
|
+
### Configuration
|
|
281
|
+
|
|
282
|
+
#### Enabling Datetime-Based Indexing
|
|
283
|
+
|
|
284
|
+
Enable datetime-based indexing by setting the following environment variable:
|
|
285
|
+
|
|
286
|
+
```bash
|
|
287
|
+
ENABLE_DATETIME_INDEX_FILTERING=true
|
|
288
|
+
```
|
|
289
|
+
|
|
290
|
+
### Related Configuration Variables
|
|
291
|
+
|
|
292
|
+
| Variable | Description | Default | Example |
|
|
293
|
+
|----------|-------------|---------|---------|
|
|
294
|
+
| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
|
|
295
|
+
| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
|
|
296
|
+
| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
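
As an illustration of how the three variables in the table above could be read, here is a minimal sketch that applies the documented defaults. The `IndexSettings` class, the `load_index_settings` function, and the set of accepted truthy strings are assumptions for this example and are not taken from the package.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class IndexSettings:
    datetime_filtering: bool
    max_size_gb: float
    items_index_prefix: str


def load_index_settings(env: Mapping[str, str] = os.environ) -> IndexSettings:
    """Read the index-related variables, falling back to the defaults from the table."""
    raw_flag = env.get("ENABLE_DATETIME_INDEX_FILTERING", "false")
    return IndexSettings(
        # Assumed parsing rule: a few common truthy spellings enable the feature.
        datetime_filtering=raw_flag.strip().lower() in {"true", "1", "yes"},
        max_size_gb=float(env.get("DATETIME_INDEX_MAX_SIZE_GB", "25")),
        items_index_prefix=env.get("STAC_ITEMS_INDEX_PREFIX", "items_"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to check in isolation.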
|
|
297
|
+
|
|
298
|
+
## How Datetime-Based Indexing Works
|
|
299
|
+
|
|
300
|
+
### Index and Alias Naming Convention
|
|
301
|
+
|
|
302
|
+
The system uses a precise naming convention:
|
|
303
|
+
|
|
304
|
+
**Physical indexes:**
|
|
305
|
+
```
|
|
306
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
|
|
307
|
+
```
|
|
308
|
+
|
|
309
|
+
**Aliases:**
|
|
310
|
+
```
|
|
311
|
+
{ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
|
|
312
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
|
|
313
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
|
|
314
|
+
```
|
|
315
|
+
|
|
316
|
+
**Example:**
|
|
317
|
+
|
|
318
|
+
*Physical indexes:*
|
|
319
|
+
- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
|
|
320
|
+
|
|
321
|
+
*Aliases:*
|
|
322
|
+
- `items_sentinel-2-l2a` - main collection alias
|
|
323
|
+
- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
|
|
324
|
+
- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
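
The naming convention above can be sketched as two small helpers. The function names are hypothetical, and the sketch assumes the collection ID and datetime strings are used verbatim; the package may normalize or escape them before building names.

```python
import uuid
from typing import Optional

ITEMS_INDEX_PREFIX = "items_"  # documented default of STAC_ITEMS_INDEX_PREFIX


def physical_index_name(collection_id: str) -> str:
    """Build a physical index name: prefix, collection ID, and a random UUID4."""
    return f"{ITEMS_INDEX_PREFIX}{collection_id}_{uuid.uuid4()}"


def alias_name(
    collection_id: str, start: Optional[str] = None, end: Optional[str] = None
) -> str:
    """Build the main alias (no dates), an active alias (start), or a closed alias (start and end)."""
    if end is not None and start is None:
        raise ValueError("an end datetime requires a start datetime")
    parts = [f"{ITEMS_INDEX_PREFIX}{collection_id}"]
    parts += [part for part in (start, end) if part is not None]
    return "_".join(parts)
```

Called with the values from the example, `alias_name("sentinel-2-l2a", "2024-01-01", "2024-03-15")` yields the closed index alias listed above.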
|
|
325
|
+
|
|
326
|
+
### Index Size Management
|
|
327
|
+
|
|
328
|
+
**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. Set the limit about 20% above the target size to account for compression overhead and metadata.
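
The 20% recommendation amounts to simple arithmetic, sketched below. The helper name is hypothetical, and rounding up to a whole number of gigabytes is an assumption made here for convenience, not a documented requirement.

```python
import math


def configured_limit_gb(target_gb: float, headroom_percent: int = 20) -> int:
    """Return the value to set for DATETIME_INDEX_MAX_SIZE_GB, rounded up to whole GB."""
    return math.ceil(target_gb * (100 + headroom_percent) / 100)
```

For a 25 GB target this gives 30, and for a 50 GB target it gives 60.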
|
|
251
329
|
|
|
252
330
|
## Interacting with the API
|
|
253
331
|
|
|
@@ -557,4 +635,3 @@ You can customize additional settings in your `.env` file:
|
|
|
557
635
|
- Ensures fair resource allocation among all clients
|
|
558
636
|
|
|
559
637
|
- **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
|
|
560
|
-
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|