stac-fastapi-opensearch 6.0.0__tar.gz → 6.2.0__tar.gz
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/PKG-INFO +91 -14
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/README.md +90 -13
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/setup.py +2 -2
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/app.py +11 -1
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/config.py +5 -0
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/database_logic.py +98 -68
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/version.py +1 -1
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/PKG-INFO +91 -14
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/requires.txt +2 -2
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/setup.cfg +0 -0
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/__init__.py +0 -0
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/SOURCES.txt +0 -0
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/dependency_links.txt +0 -0
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/entry_points.txt +0 -0
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/not-zip-safe +0 -0
- {stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/top_level.txt +0 -0
|
{stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/PKG-INFO
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.1
|
|
2
2
|
Name: stac_fastapi_opensearch
|
|
3
|
-
Version: 6.
|
|
3
|
+
Version: 6.2.0
|
|
4
4
|
Summary: Opensearch stac-fastapi backend.
|
|
5
5
|
Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
|
|
6
6
|
License: MIT
|
|
@@ -36,7 +36,7 @@ Provides-Extra: server
|
|
|
36
36
|
[](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/network/members)
|
|
37
37
|
[](https://pypi.org/project/stac-fastapi-elasticsearch/)
|
|
38
38
|
[](https://github.com/radiantearth/stac-spec/tree/v1.1.0)
|
|
39
|
-
[](https://github.com/stac-utils/stac-fastapi)
|
|
40
40
|
|
|
41
41
|
## Sponsors & Supporters
|
|
42
42
|
|
|
@@ -106,6 +106,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
|
|
|
106
106
|
- [Auth](#auth)
|
|
107
107
|
- [Aggregation](#aggregation)
|
|
108
108
|
- [Rate Limiting](#rate-limiting)
|
|
109
|
+
- [Datetime-Based Index Management](#datetime-based-index-management)
|
|
109
110
|
|
|
110
111
|
## Documentation & Resources
|
|
111
112
|
|
|
@@ -226,28 +227,105 @@ You can customize additional settings in your `.env` file:
|
|
|
226
227
|
|------------------------------|--------------------------------------------------------------------------------------|--------------------------|---------------------------------------------------------------------------------------------|
|
|
227
228
|
| `ES_HOST` | Hostname for external Elasticsearch/OpenSearch. | `localhost` | Optional |
|
|
228
229
|
| `ES_PORT` | Port for Elasticsearch/OpenSearch. | `9200` (ES) / `9202` (OS)| Optional |
|
|
229
|
-
| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `
|
|
230
|
-
| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `
|
|
230
|
+
| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `true` | Optional |
|
|
231
|
+
| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `true` | Optional |
|
|
232
|
+
| `ES_API_KEY` | API Key for external Elasticsearch/OpenSearch. | N/A | Optional |
|
|
233
|
+
| `ES_TIMEOUT` | Client timeout for Elasticsearch/OpenSearch. | DB client default | Optional |
|
|
231
234
|
| `STAC_FASTAPI_TITLE` | Title of the API in the documentation. | `stac-fastapi-<backend>` | Optional |
|
|
232
235
|
| `STAC_FASTAPI_DESCRIPTION` | Description of the API in the documentation. | N/A | Optional |
|
|
233
236
|
| `STAC_FASTAPI_VERSION` | API version. | `2.1` | Optional |
|
|
234
|
-
| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID
|
|
237
|
+
| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID | `stac-fastapi` | Optional |
|
|
235
238
|
| `APP_HOST` | Server bind address. | `0.0.0.0` | Optional |
|
|
236
|
-
| `APP_PORT` | Server port. | `
|
|
239
|
+
| `APP_PORT` | Server port. | `8000` | Optional |
|
|
237
240
|
| `ENVIRONMENT` | Runtime environment. | `local` | Optional |
|
|
238
241
|
| `WEB_CONCURRENCY` | Number of worker processes. | `10` | Optional |
|
|
239
242
|
| `RELOAD` | Enable auto-reload for development. | `true` | Optional |
|
|
240
243
|
| `STAC_FASTAPI_RATE_LIMIT` | API rate limit per client. | `200/minute` | Optional |
|
|
241
|
-
| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional
|
|
242
|
-
| `ELASTICSEARCH_VERSION`
|
|
243
|
-
| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional
|
|
244
|
-
| `ENABLE_DIRECT_RESPONSE`
|
|
245
|
-
| `RAISE_ON_BULK_ERROR`
|
|
246
|
-
| `DATABASE_REFRESH`
|
|
244
|
+
| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional |
|
|
245
|
+
| `ELASTICSEARCH_VERSION` | Version of Elasticsearch to use. | `8.11.0` | Optional |
|
|
246
|
+
| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional |
|
|
247
|
+
| `ENABLE_DIRECT_RESPONSE` | Enable direct response for maximum performance (disables all FastAPI dependencies, including authentication, custom status codes, and validation) | `false` | Optional |
|
|
248
|
+
| `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
|
|
249
|
+
| `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
|
|
247
250
|
| `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
|
|
248
251
|
|
|
249
252
|
> [!NOTE]
|
|
250
|
-
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, and `
|
|
253
|
+
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
|
|
254
|
+
|
|
255
|
+
## Datetime-Based Index Management
|
|
256
|
+
|
|
257
|
+
### Overview
|
|
258
|
+
|
|
259
|
+
SFEOS supports two indexing strategies for managing STAC items:
|
|
260
|
+
|
|
261
|
+
1. **Simple Indexing** (default) - One index per collection
|
|
262
|
+
2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
|
|
263
|
+
|
|
264
|
+
The datetime-based indexing strategy is particularly useful for large temporal datasets. When a query includes a datetime parameter, the system can restrict the search to the indexes that cover that time range, which makes such searches **several times faster** and significantly **reduces database load**.
|
|
265
|
+
|
|
266
|
+
### When to Use
|
|
267
|
+
|
|
268
|
+
**Recommended for:**
|
|
269
|
+
- Systems with large collections containing millions of items
|
|
270
|
+
- Systems requiring high-performance temporal searching
|
|
271
|
+
|
|
272
|
+
**Pros:**
|
|
273
|
+
- Multiple times faster queries with datetime filter
|
|
274
|
+
- Reduced database load - only relevant indexes are searched
|
|
275
|
+
|
|
276
|
+
**Cons:**
|
|
277
|
+
- Slightly longer item indexing time (automatic index management)
|
|
278
|
+
- Greater management complexity
|
|
279
|
+
|
|
280
|
+
### Configuration
|
|
281
|
+
|
|
282
|
+
#### Enabling Datetime-Based Indexing
|
|
283
|
+
|
|
284
|
+
Enable datetime-based indexing by setting the following environment variable:
|
|
285
|
+
|
|
286
|
+
```bash
|
|
287
|
+
ENABLE_DATETIME_INDEX_FILTERING=true
|
|
288
|
+
```
|
|
289
|
+
|
|
290
|
+
### Related Configuration Variables
|
|
291
|
+
|
|
292
|
+
| Variable | Description | Default | Example |
|
|
293
|
+
|----------|-------------|---------|---------|
|
|
294
|
+
| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
|
|
295
|
+
| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
|
|
296
|
+
| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
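The three variables in the table above could be read at startup roughly as follows. This is a minimal sketch, not the package's own loader: `DatetimeIndexConfig` and `load_datetime_index_config` are hypothetical names, the defaults are copied from the table, and the rule for parsing the boolean is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DatetimeIndexConfig:
    """Hypothetical container for the settings listed in the table."""

    enabled: bool
    max_size_gb: float
    items_index_prefix: str


def load_datetime_index_config(
    env: Optional[Mapping[str, str]] = None,
) -> DatetimeIndexConfig:
    """Read the datetime-index settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    raw_enabled = env.get("ENABLE_DATETIME_INDEX_FILTERING", "false")
    return DatetimeIndexConfig(
        # Assumption: only the string "true" (any letter case) enables the feature.
        enabled=raw_enabled.strip().lower() == "true",
        max_size_gb=float(env.get("DATETIME_INDEX_MAX_SIZE_GB", "25")),
        items_index_prefix=env.get("STAC_ITEMS_INDEX_PREFIX", "items_"),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise without touching the process environment.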
|
|
297
|
+
|
|
298
|
+
## How Datetime-Based Indexing Works
|
|
299
|
+
|
|
300
|
+
### Index and Alias Naming Convention
|
|
301
|
+
|
|
302
|
+
The system uses a precise naming convention:
|
|
303
|
+
|
|
304
|
+
**Physical indexes:**
|
|
305
|
+
```
|
|
306
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
|
|
307
|
+
```
|
|
308
|
+
|
|
309
|
+
**Aliases:**
|
|
310
|
+
```
|
|
311
|
+
{ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
|
|
312
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
|
|
313
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
|
|
314
|
+
```
|
|
315
|
+
|
|
316
|
+
**Example:**
|
|
317
|
+
|
|
318
|
+
*Physical indexes:*
|
|
319
|
+
- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
|
|
320
|
+
|
|
321
|
+
*Aliases:*
|
|
322
|
+
- `items_sentinel-2-l2a` - main collection alias
|
|
323
|
+
- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
|
|
324
|
+
- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
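The naming convention above can be sketched as two small string builders. The function names are hypothetical, the prefix is the documented default of `STAC_ITEMS_INDEX_PREFIX`, and the sketch ignores any character escaping the real helpers may apply to collection ids.

```python
import uuid
from typing import Optional

ITEMS_INDEX_PREFIX = "items_"  # documented default; assumed here for illustration


def physical_index_name(collection_id: str, suffix: Optional[str] = None) -> str:
    """Build {ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}."""
    # A fresh uuid4 is generated unless a suffix is supplied (useful for tests).
    suffix = suffix or str(uuid.uuid4())
    return f"{ITEMS_INDEX_PREFIX}{collection_id}_{suffix}"


def alias_name(
    collection_id: str,
    start: Optional[str] = None,
    end: Optional[str] = None,
) -> str:
    """Build the main, temporal, or closed-index alias for a collection."""
    name = f"{ITEMS_INDEX_PREFIX}{collection_id}"
    if start is not None:
        name += f"_{start}"  # temporal alias: open-ended from `start`
        if end is not None:
            name += f"_{end}"  # closed index alias: bounded by `end`
    return name
```

With the example collection, `alias_name("sentinel-2-l2a", "2024-01-01", "2024-03-15")` yields the closed index alias listed above.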
|
|
325
|
+
|
|
326
|
+
### Index Size Management
|
|
327
|
+
|
|
328
|
+
**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add +20% to the target size to account for compression overhead and metadata.
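As a worked example of the 20% recommendation, the value to put in `DATETIME_INDEX_MAX_SIZE_GB` can be derived from a target size. The helper name is hypothetical and the 20% figure is simply the one quoted above.

```python
def configured_limit_gb(target_gb: float, overhead_pct: float = 20) -> float:
    """Return the target size increased by the given percentage."""
    return target_gb * (100 + overhead_pct) / 100


# A 25 GB target becomes a configured limit of 30 GB.
print(configured_limit_gb(25))
```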
|
|
251
329
|
|
|
252
330
|
## Interacting with the API
|
|
253
331
|
|
|
@@ -557,4 +635,3 @@ You can customize additional settings in your `.env` file:
|
|
|
557
635
|
- Ensures fair resource allocation among all clients
|
|
558
636
|
|
|
559
637
|
- **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
|
|
560
|
-
|
|
{stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/README.md
@@ -15,7 +15,7 @@
|
|
|
15
15
|
[](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/network/members)
|
|
16
16
|
[](https://pypi.org/project/stac-fastapi-elasticsearch/)
|
|
17
17
|
[](https://github.com/radiantearth/stac-spec/tree/v1.1.0)
|
|
18
|
-
[](https://github.com/stac-utils/stac-fastapi)
|
|
19
19
|
|
|
20
20
|
## Sponsors & Supporters
|
|
21
21
|
|
|
@@ -85,6 +85,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
|
|
|
85
85
|
- [Auth](#auth)
|
|
86
86
|
- [Aggregation](#aggregation)
|
|
87
87
|
- [Rate Limiting](#rate-limiting)
|
|
88
|
+
- [Datetime-Based Index Management](#datetime-based-index-management)
|
|
88
89
|
|
|
89
90
|
## Documentation & Resources
|
|
90
91
|
|
|
@@ -205,28 +206,105 @@ You can customize additional settings in your `.env` file:
|
|
|
205
206
|
|------------------------------|--------------------------------------------------------------------------------------|--------------------------|---------------------------------------------------------------------------------------------|
|
|
206
207
|
| `ES_HOST` | Hostname for external Elasticsearch/OpenSearch. | `localhost` | Optional |
|
|
207
208
|
| `ES_PORT` | Port for Elasticsearch/OpenSearch. | `9200` (ES) / `9202` (OS)| Optional |
|
|
208
|
-
| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `
|
|
209
|
-
| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `
|
|
209
|
+
| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `true` | Optional |
|
|
210
|
+
| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `true` | Optional |
|
|
211
|
+
| `ES_API_KEY` | API Key for external Elasticsearch/OpenSearch. | N/A | Optional |
|
|
212
|
+
| `ES_TIMEOUT` | Client timeout for Elasticsearch/OpenSearch. | DB client default | Optional |
|
|
210
213
|
| `STAC_FASTAPI_TITLE` | Title of the API in the documentation. | `stac-fastapi-<backend>` | Optional |
|
|
211
214
|
| `STAC_FASTAPI_DESCRIPTION` | Description of the API in the documentation. | N/A | Optional |
|
|
212
215
|
| `STAC_FASTAPI_VERSION` | API version. | `2.1` | Optional |
|
|
213
|
-
| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID
|
|
216
|
+
| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID | `stac-fastapi` | Optional |
|
|
214
217
|
| `APP_HOST` | Server bind address. | `0.0.0.0` | Optional |
|
|
215
|
-
| `APP_PORT` | Server port. | `
|
|
218
|
+
| `APP_PORT` | Server port. | `8000` | Optional |
|
|
216
219
|
| `ENVIRONMENT` | Runtime environment. | `local` | Optional |
|
|
217
220
|
| `WEB_CONCURRENCY` | Number of worker processes. | `10` | Optional |
|
|
218
221
|
| `RELOAD` | Enable auto-reload for development. | `true` | Optional |
|
|
219
222
|
| `STAC_FASTAPI_RATE_LIMIT` | API rate limit per client. | `200/minute` | Optional |
|
|
220
|
-
| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional
|
|
221
|
-
| `ELASTICSEARCH_VERSION`
|
|
222
|
-
| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional
|
|
223
|
-
| `ENABLE_DIRECT_RESPONSE`
|
|
224
|
-
| `RAISE_ON_BULK_ERROR`
|
|
225
|
-
| `DATABASE_REFRESH`
|
|
223
|
+
| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional |
|
|
224
|
+
| `ELASTICSEARCH_VERSION` | Version of Elasticsearch to use. | `8.11.0` | Optional |
|
|
225
|
+
| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional |
|
|
226
|
+
| `ENABLE_DIRECT_RESPONSE` | Enable direct response for maximum performance (disables all FastAPI dependencies, including authentication, custom status codes, and validation) | `false` | Optional |
|
|
227
|
+
| `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
|
|
228
|
+
| `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
|
|
226
229
|
| `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
|
|
227
230
|
|
|
228
231
|
> [!NOTE]
|
|
229
|
-
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, and `
|
|
232
|
+
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
|
|
233
|
+
|
|
234
|
+
## Datetime-Based Index Management
|
|
235
|
+
|
|
236
|
+
### Overview
|
|
237
|
+
|
|
238
|
+
SFEOS supports two indexing strategies for managing STAC items:
|
|
239
|
+
|
|
240
|
+
1. **Simple Indexing** (default) - One index per collection
|
|
241
|
+
2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
|
|
242
|
+
|
|
243
|
+
The datetime-based indexing strategy is particularly useful for large temporal datasets. When a query includes a datetime parameter, the system can restrict the search to the indexes that cover that time range, which makes such searches **several times faster** and significantly **reduces database load**.
|
|
244
|
+
|
|
245
|
+
### When to Use
|
|
246
|
+
|
|
247
|
+
**Recommended for:**
|
|
248
|
+
- Systems with large collections containing millions of items
|
|
249
|
+
- Systems requiring high-performance temporal searching
|
|
250
|
+
|
|
251
|
+
**Pros:**
|
|
252
|
+
- Multiple times faster queries with datetime filter
|
|
253
|
+
- Reduced database load - only relevant indexes are searched
|
|
254
|
+
|
|
255
|
+
**Cons:**
|
|
256
|
+
- Slightly longer item indexing time (automatic index management)
|
|
257
|
+
- Greater management complexity
|
|
258
|
+
|
|
259
|
+
### Configuration
|
|
260
|
+
|
|
261
|
+
#### Enabling Datetime-Based Indexing
|
|
262
|
+
|
|
263
|
+
Enable datetime-based indexing by setting the following environment variable:
|
|
264
|
+
|
|
265
|
+
```bash
|
|
266
|
+
ENABLE_DATETIME_INDEX_FILTERING=true
|
|
267
|
+
```
|
|
268
|
+
|
|
269
|
+
### Related Configuration Variables
|
|
270
|
+
|
|
271
|
+
| Variable | Description | Default | Example |
|
|
272
|
+
|----------|-------------|---------|---------|
|
|
273
|
+
| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
|
|
274
|
+
| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
|
|
275
|
+
| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
|
|
276
|
+
|
|
277
|
+
## How Datetime-Based Indexing Works
|
|
278
|
+
|
|
279
|
+
### Index and Alias Naming Convention
|
|
280
|
+
|
|
281
|
+
The system uses a precise naming convention:
|
|
282
|
+
|
|
283
|
+
**Physical indexes:**
|
|
284
|
+
```
|
|
285
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
|
|
286
|
+
```
|
|
287
|
+
|
|
288
|
+
**Aliases:**
|
|
289
|
+
```
|
|
290
|
+
{ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
|
|
291
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
|
|
292
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
|
|
293
|
+
```
|
|
294
|
+
|
|
295
|
+
**Example:**
|
|
296
|
+
|
|
297
|
+
*Physical indexes:*
|
|
298
|
+
- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
|
|
299
|
+
|
|
300
|
+
*Aliases:*
|
|
301
|
+
- `items_sentinel-2-l2a` - main collection alias
|
|
302
|
+
- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
|
|
303
|
+
- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
|
|
304
|
+
|
|
305
|
+
### Index Size Management
|
|
306
|
+
|
|
307
|
+
**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add +20% to the target size to account for compression overhead and metadata.
|
|
230
308
|
|
|
231
309
|
## Interacting with the API
|
|
232
310
|
|
|
@@ -536,4 +614,3 @@ You can customize additional settings in your `.env` file:
|
|
|
536
614
|
- Ensures fair resource allocation among all clients
|
|
537
615
|
|
|
538
616
|
- **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
|
|
539
|
-
|
|
{stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/setup.py
@@ -6,8 +6,8 @@ with open("README.md") as f:
|
|
|
6
6
|
desc = f.read()
|
|
7
7
|
|
|
8
8
|
install_requires = [
|
|
9
|
-
"stac-fastapi-core==6.
|
|
10
|
-
"sfeos-helpers==6.
|
|
9
|
+
"stac-fastapi-core==6.2.0",
|
|
10
|
+
"sfeos-helpers==6.2.0",
|
|
11
11
|
"opensearch-py~=2.8.0",
|
|
12
12
|
"opensearch-py[async]~=2.8.0",
|
|
13
13
|
"uvicorn~=0.23.0",
|
{stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/app.py
RENAMED
|
@@ -25,6 +25,7 @@ from stac_fastapi.core.session import Session
|
|
|
25
25
|
from stac_fastapi.core.utilities import get_bool_env
|
|
26
26
|
from stac_fastapi.extensions.core import (
|
|
27
27
|
AggregationExtension,
|
|
28
|
+
CollectionSearchExtension,
|
|
28
29
|
FilterExtension,
|
|
29
30
|
FreeTextExtension,
|
|
30
31
|
SortExtension,
|
|
@@ -60,6 +61,14 @@ filter_extension.conformance_classes.append(
|
|
|
60
61
|
FilterConformanceClasses.ADVANCED_COMPARISON_OPERATORS
|
|
61
62
|
)
|
|
62
63
|
|
|
64
|
+
# Adding collection search extension for compatibility with stac-auth-proxy
|
|
65
|
+
# (https://github.com/developmentseed/stac-auth-proxy)
|
|
66
|
+
# The extension is not fully implemented yet but is required for collection filtering support
|
|
67
|
+
collection_search_extension = CollectionSearchExtension()
|
|
68
|
+
collection_search_extension.conformance_classes.append(
|
|
69
|
+
"https://api.stacspec.org/v1.0.0-rc.1/collection-search#filter"
|
|
70
|
+
)
|
|
71
|
+
|
|
63
72
|
aggregation_extension = AggregationExtension(
|
|
64
73
|
client=EsAsyncBaseAggregationClient(
|
|
65
74
|
database=database_logic, session=session, settings=settings
|
|
@@ -75,6 +84,7 @@ search_extensions = [
|
|
|
75
84
|
TokenPaginationExtension(),
|
|
76
85
|
filter_extension,
|
|
77
86
|
FreeTextExtension(),
|
|
87
|
+
collection_search_extension,
|
|
78
88
|
]
|
|
79
89
|
|
|
80
90
|
|
|
@@ -108,7 +118,7 @@ post_request_model = create_post_request_model(search_extensions)
|
|
|
108
118
|
app_config = {
|
|
109
119
|
"title": os.getenv("STAC_FASTAPI_TITLE", "stac-fastapi-opensearch"),
|
|
110
120
|
"description": os.getenv("STAC_FASTAPI_DESCRIPTION", "stac-fastapi-opensearch"),
|
|
111
|
-
"api_version": os.getenv("STAC_FASTAPI_VERSION", "6.
|
|
121
|
+
"api_version": os.getenv("STAC_FASTAPI_VERSION", "6.2.0"),
|
|
112
122
|
"settings": settings,
|
|
113
123
|
"extensions": extensions,
|
|
114
124
|
"client": CoreClient(
|
{stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/config.py
RENAMED
|
@@ -1,4 +1,5 @@
|
|
|
1
1
|
"""API configuration."""
|
|
2
|
+
|
|
2
3
|
import logging
|
|
3
4
|
import os
|
|
4
5
|
import ssl
|
|
@@ -53,6 +54,10 @@ def _es_config() -> Dict[str, Any]:
|
|
|
53
54
|
|
|
54
55
|
config["headers"] = headers
|
|
55
56
|
|
|
57
|
+
# Include timeout setting if set
|
|
58
|
+
if timeout := os.getenv("ES_TIMEOUT"):
|
|
59
|
+
config["timeout"] = timeout
|
|
60
|
+
|
|
56
61
|
# Explicitly exclude SSL settings when not using SSL
|
|
57
62
|
if not use_ssl:
|
|
58
63
|
return config
|
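The `ES_TIMEOUT` branch added in the hunk above stores the environment value exactly as read, which means as a string. A minimal sketch of the same pattern with a numeric conversion, assuming the client expects the timeout in seconds as a number; `build_timeout_config` is a hypothetical helper, not part of the package:

```python
from typing import Any, Dict, Mapping


def build_timeout_config(env: Mapping[str, str]) -> Dict[str, Any]:
    """Return a client config fragment containing a timeout only when one is set."""
    config: Dict[str, Any] = {}
    # Same walrus pattern as the hunk: unset or empty values are skipped.
    if timeout := env.get("ES_TIMEOUT"):
        # Assumption: the value is a number of seconds; float() raises on bad input.
        config["timeout"] = float(timeout)
    return config
```

Failing early with a `ValueError` on a malformed value is a design choice of this sketch; the released code does no conversion.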
|
{stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/database_logic.py
@@ -4,7 +4,7 @@ import asyncio
|
|
|
4
4
|
import logging
|
|
5
5
|
from base64 import urlsafe_b64decode, urlsafe_b64encode
|
|
6
6
|
from copy import deepcopy
|
|
7
|
-
from typing import Any, Dict, Iterable, List, Optional, Tuple, Type
|
|
7
|
+
from typing import Any, Dict, Iterable, List, Optional, Tuple, Type
|
|
8
8
|
|
|
9
9
|
import attr
|
|
10
10
|
import orjson
|
|
@@ -26,7 +26,7 @@ from stac_fastapi.opensearch.config import (
|
|
|
26
26
|
AsyncOpensearchSettings as AsyncSearchSettings,
|
|
27
27
|
)
|
|
28
28
|
from stac_fastapi.opensearch.config import OpensearchSettings as SyncSearchSettings
|
|
29
|
-
from stac_fastapi.sfeos_helpers import filter
|
|
29
|
+
from stac_fastapi.sfeos_helpers import filter as filter_module
|
|
30
30
|
from stac_fastapi.sfeos_helpers.database import (
|
|
31
31
|
apply_free_text_filter_shared,
|
|
32
32
|
apply_intersects_filter_shared,
|
|
@@ -34,14 +34,16 @@ from stac_fastapi.sfeos_helpers.database import (
|
|
|
34
34
|
delete_item_index_shared,
|
|
35
35
|
get_queryables_mapping_shared,
|
|
36
36
|
index_alias_by_collection_id,
|
|
37
|
-
index_by_collection_id,
|
|
38
|
-
indices,
|
|
39
37
|
mk_actions,
|
|
40
38
|
mk_item_id,
|
|
41
39
|
populate_sort_shared,
|
|
42
40
|
return_date,
|
|
43
41
|
validate_refresh,
|
|
44
42
|
)
|
|
43
|
+
from stac_fastapi.sfeos_helpers.database.query import (
|
|
44
|
+
ES_MAX_URL_LENGTH,
|
|
45
|
+
add_collections_to_body,
|
|
46
|
+
)
|
|
45
47
|
from stac_fastapi.sfeos_helpers.database.utils import (
|
|
46
48
|
merge_to_operations,
|
|
47
49
|
operations_to_script,
|
|
@@ -51,15 +53,18 @@ from stac_fastapi.sfeos_helpers.mappings import (
|
|
|
51
53
|
COLLECTIONS_INDEX,
|
|
52
54
|
DEFAULT_SORT,
|
|
53
55
|
ES_COLLECTIONS_MAPPINGS,
|
|
54
|
-
ES_ITEMS_MAPPINGS,
|
|
55
|
-
ES_ITEMS_SETTINGS,
|
|
56
56
|
ITEM_INDICES,
|
|
57
57
|
ITEMS_INDEX_PREFIX,
|
|
58
58
|
Geometry,
|
|
59
59
|
)
|
|
60
|
+
from stac_fastapi.sfeos_helpers.search_engine import (
|
|
61
|
+
BaseIndexInserter,
|
|
62
|
+
BaseIndexSelector,
|
|
63
|
+
IndexInsertionFactory,
|
|
64
|
+
IndexSelectorFactory,
|
|
65
|
+
)
|
|
60
66
|
from stac_fastapi.types.errors import ConflictError, NotFoundError
|
|
61
67
|
from stac_fastapi.types.links import resolve_links
|
|
62
|
-
from stac_fastapi.types.rfc3339 import DateTimeType
|
|
63
68
|
from stac_fastapi.types.stac import Collection, Item
|
|
64
69
|
|
|
65
70
|
logger = logging.getLogger(__name__)
|
|
@@ -100,33 +105,6 @@ async def create_collection_index() -> None:
|
|
|
100
105
|
await client.close()
|
|
101
106
|
|
|
102
107
|
|
|
103
|
-
async def create_item_index(collection_id: str) -> None:
|
|
104
|
-
"""
|
|
105
|
-
Create the index for Items. The settings of the index template will be used implicitly.
|
|
106
|
-
|
|
107
|
-
Args:
|
|
108
|
-
collection_id (str): Collection identifier.
|
|
109
|
-
|
|
110
|
-
Returns:
|
|
111
|
-
None
|
|
112
|
-
|
|
113
|
-
"""
|
|
114
|
-
client = AsyncSearchSettings().create_client
|
|
115
|
-
|
|
116
|
-
index_name = f"{index_by_collection_id(collection_id)}-000001"
|
|
117
|
-
exists = await client.indices.exists(index=index_name)
|
|
118
|
-
if not exists:
|
|
119
|
-
await client.indices.create(
|
|
120
|
-
index=index_name,
|
|
121
|
-
body={
|
|
122
|
-
"aliases": {index_alias_by_collection_id(collection_id): {}},
|
|
123
|
-
"mappings": ES_ITEMS_MAPPINGS,
|
|
124
|
-
"settings": ES_ITEMS_SETTINGS,
|
|
125
|
-
},
|
|
126
|
-
)
|
|
127
|
-
await client.close()
|
|
128
|
-
|
|
129
|
-
|
|
130
108
|
async def delete_item_index(collection_id: str) -> None:
|
|
131
109
|
"""Delete the index for items in a collection.
|
|
132
110
|
|
|
@@ -148,6 +126,9 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
148
126
|
async_settings: AsyncSearchSettings = attr.ib(factory=AsyncSearchSettings)
|
|
149
127
|
sync_settings: SyncSearchSettings = attr.ib(factory=SyncSearchSettings)
|
|
150
128
|
|
|
129
|
+
async_index_selector: BaseIndexSelector = attr.ib(init=False)
|
|
130
|
+
async_index_inserter: BaseIndexInserter = attr.ib(init=False)
|
|
131
|
+
|
|
151
132
|
client = attr.ib(init=False)
|
|
152
133
|
sync_client = attr.ib(init=False)
|
|
153
134
|
|
|
@@ -155,6 +136,10 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
155
136
|
"""Initialize clients after the class is instantiated."""
|
|
156
137
|
self.client = self.async_settings.create_client
|
|
157
138
|
self.sync_client = self.sync_settings.create_client
|
|
139
|
+
self.async_index_inserter = IndexInsertionFactory.create_insertion_strategy(
|
|
140
|
+
self.client
|
|
141
|
+
)
|
|
142
|
+
self.async_index_selector = IndexSelectorFactory.create_selector(self.client)
|
|
158
143
|
|
|
159
144
|
item_serializer: Type[ItemSerializer] = attr.ib(default=ItemSerializer)
|
|
160
145
|
collection_serializer: Type[CollectionSerializer] = attr.ib(
|
|
@@ -230,15 +215,23 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
230
215
|
with the index for the Collection as the target index and the combined `mk_item_id` as the document id.
|
|
231
216
|
"""
|
|
232
217
|
try:
|
|
233
|
-
|
|
218
|
+
response = await self.client.search(
|
|
234
219
|
index=index_alias_by_collection_id(collection_id),
|
|
235
|
-
|
|
220
|
+
body={
|
|
221
|
+
"query": {"term": {"_id": mk_item_id(item_id, collection_id)}},
|
|
222
|
+
"size": 1,
|
|
223
|
+
},
|
|
236
224
|
)
|
|
225
|
+
if response["hits"]["total"]["value"] == 0:
|
|
226
|
+
raise NotFoundError(
|
|
227
|
+
f"Item {item_id} does not exist inside Collection {collection_id}"
|
|
228
|
+
)
|
|
229
|
+
|
|
230
|
+
return response["hits"]["hits"][0]["_source"]
|
|
237
231
|
except exceptions.NotFoundError:
|
|
238
232
|
raise NotFoundError(
|
|
239
233
|
f"Item {item_id} does not exist inside Collection {collection_id}"
|
|
240
234
|
)
|
|
241
|
-
return item["_source"]
|
|
242
235
|
|
|
243
236
|
async def get_queryables_mapping(self, collection_id: str = "*") -> dict:
|
|
244
237
|
"""Retrieve mapping of Queryables for search.
|
|
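The hunk above replaces a direct document fetch with a search on `_id` through the collection alias, presumably because an alias may now point at several physical indexes. The request body and the response handling can be sketched without a live cluster. `ItemNotFound`, `item_query_body`, and `source_from_response` are hypothetical names, and the response shape is the one the hunk itself indexes into.

```python
from typing import Any, Dict


class ItemNotFound(Exception):
    """Stand-in for the NotFoundError raised in the hunk."""


def item_query_body(doc_id: str) -> Dict[str, Any]:
    """Build the term query on `_id` used to look up a single document."""
    return {"query": {"term": {"_id": doc_id}}, "size": 1}


def source_from_response(
    response: Dict[str, Any], item_id: str, collection_id: str
) -> Dict[str, Any]:
    """Return the first hit's `_source`, or raise when the search matched nothing."""
    if response["hits"]["total"]["value"] == 0:
        raise ItemNotFound(
            f"Item {item_id} does not exist inside Collection {collection_id}"
        )
    return response["hits"]["hits"][0]["_source"]
```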
@@ -292,31 +285,21 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
292
285
|
|
|
293
286
|
@staticmethod
|
|
294
287
|
def apply_datetime_filter(
|
|
295
|
-
search: Search,
|
|
296
|
-
) -> Search:
|
|
288
|
+
search: Search, datetime: Optional[str]
|
|
289
|
+
) -> Tuple[Search, Dict[str, Optional[str]]]:
|
|
297
290
|
"""Apply a filter to search on datetime, start_datetime, and end_datetime fields.
|
|
298
291
|
|
|
299
292
|
Args:
|
|
300
293
|
search: The search object to filter.
|
|
301
|
-
|
|
302
|
-
- A single datetime string (e.g., "2023-01-01T12:00:00")
|
|
303
|
-
- A datetime range string (e.g., "2023-01-01/2023-12-31")
|
|
304
|
-
- A datetime object
|
|
305
|
-
- A tuple of (start_datetime, end_datetime)
|
|
294
|
+
datetime: Optional[str]
|
|
306
295
|
|
|
307
296
|
Returns:
|
|
308
297
|
The filtered search object.
|
|
309
298
|
"""
|
|
310
|
-
|
|
311
|
-
return search
|
|
299
|
+
datetime_search = return_date(datetime)
|
|
312
300
|
|
|
313
|
-
|
|
314
|
-
|
|
315
|
-
datetime_search = return_date(interval)
|
|
316
|
-
except (ValueError, TypeError) as e:
|
|
317
|
-
# Handle invalid interval formats if return_date fails
|
|
318
|
-
logger.error(f"Invalid interval format: {interval}, error: {e}")
|
|
319
|
-
return search
|
|
301
|
+
if not datetime_search:
|
|
302
|
+
return search, datetime_search
|
|
320
303
|
|
|
321
304
|
if "eq" in datetime_search:
|
|
322
305
|
# For exact matches, include:
|
|
@@ -383,7 +366,10 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
383
366
|
),
|
|
384
367
|
]
|
|
385
368
|
|
|
386
|
-
return
|
|
369
|
+
return (
|
|
370
|
+
search.query(Q("bool", should=should, minimum_should_match=1)),
|
|
371
|
+
datetime_search,
|
|
372
|
+
)
|
|
387
373
|
|
|
388
374
|
@staticmethod
|
|
389
375
|
def apply_bbox_filter(search: Search, bbox: List):
|
|
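After this change `apply_datetime_filter` returns the parsed datetime range together with the filtered search, so the caller can reuse the range for index selection. The sketch below illustrates only that return contract, using plain dictionaries. `parse_datetime` is a hypothetical stand-in for `return_date`, its output keys (`eq`, `gte`, `lte`) are an assumption based on the `"eq"` check visible in the diff, and the single-field filter is a deliberate simplification of the real query.

```python
from typing import Any, Dict, List, Optional, Tuple


def parse_datetime(value: Optional[str]) -> Dict[str, Optional[str]]:
    """Hypothetical stand-in for `return_date`: parse 'a', 'a/b', 'a/..' or '../b'."""
    if not value:
        return {}
    if "/" not in value:
        return {"eq": value}
    start, end = value.split("/", 1)
    return {
        "gte": None if start in ("", "..") else start,
        "lte": None if end in ("", "..") else end,
    }


def apply_datetime_filter_sketch(
    filters: List[Dict[str, Any]], value: Optional[str]
) -> Tuple[List[Dict[str, Any]], Dict[str, Optional[str]]]:
    """Return the (possibly extended) filter list and the parsed range."""
    datetime_search = parse_datetime(value)
    if not datetime_search:
        # Nothing to filter on: hand back the inputs unchanged.
        return filters, datetime_search
    if "eq" in datetime_search:
        clause = {"term": {"properties.datetime": datetime_search["eq"]}}
    else:
        bounds = {k: v for k, v in datetime_search.items() if v is not None}
        clause = {"range": {"properties.datetime": bounds}}
    return filters + [clause], datetime_search
```

A caller unpacks both values, for example `filters, datetime_search = apply_datetime_filter_sketch([], "2024-01-01/..")`, and forwards `datetime_search` to whatever selects the indexes.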
@@ -480,7 +466,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
480
466
|
otherwise the original Search object.
|
|
481
467
|
"""
|
|
482
468
|
if _filter is not None:
|
|
483
|
-
es_query =
|
|
469
|
+
es_query = filter_module.to_es(await self.get_queryables_mapping(), _filter)
|
|
484
470
|
search = search.filter(es_query)
|
|
485
471
|
|
|
486
472
|
return search
|
|
@@ -507,6 +493,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
507
493
|
token: Optional[str],
|
|
508
494
|
sort: Optional[Dict[str, Dict[str, str]]],
|
|
509
495
|
collection_ids: Optional[List[str]],
|
|
496
|
+
datetime_search: Dict[str, Optional[str]],
|
|
510
497
|
ignore_unavailable: bool = True,
|
|
511
498
|
) -> Tuple[Iterable[Dict[str, Any]], Optional[int], Optional[str]]:
|
|
512
499
|
"""Execute a search query with limit and other optional parameters.
|
|
@@ -517,6 +504,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
517
504
|
token (Optional[str]): The token used to return the next set of results.
|
|
518
505
|
sort (Optional[Dict[str, Dict[str, str]]]): Specifies how the results should be sorted.
|
|
519
506
|
collection_ids (Optional[List[str]]): The collection ids to search.
|
|
507
|
+
datetime_search (Dict[str, Optional[str]]): Datetime range used for index selection.
|
|
520
508
|
ignore_unavailable (bool, optional): Whether to ignore unavailable collections. Defaults to True.
|
|
521
509
|
|
|
522
510
|
Returns:
|
|
@@ -532,6 +520,14 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
532
520
|
"""
|
|
533
521
|
search_body: Dict[str, Any] = {}
|
|
534
522
|
query = search.query.to_dict() if search.query else None
|
|
523
|
+
|
|
524
|
+
index_param = await self.async_index_selector.select_indexes(
|
|
525
|
+
collection_ids, datetime_search
|
|
526
|
+
)
|
|
527
|
+
if len(index_param) > ES_MAX_URL_LENGTH - 300:
|
|
528
|
+
index_param = ITEM_INDICES
|
|
529
|
+
query = add_collections_to_body(collection_ids, query)
|
|
530
|
+
|
|
535
531
|
if query:
|
|
536
532
|
search_body["query"] = query
|
|
537
533
|
|
|
@@ -544,8 +540,6 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
544
540
|
|
|
545
541
|
search_body["sort"] = sort if sort else DEFAULT_SORT
|
|
546
542
|
|
|
547
|
-
index_param = indices(collection_ids)
|
|
548
|
-
|
|
549
543
|
max_result_window = MAX_LIMIT
|
|
550
544
|
|
|
551
545
|
size_limit = min(limit + 1, max_result_window)
|
|
@@ -606,6 +600,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
606
600
|
geometry_geohash_grid_precision: int,
|
|
607
601
|
geometry_geotile_grid_precision: int,
|
|
608
602
|
datetime_frequency_interval: str,
|
|
603
|
+
datetime_search,
|
|
609
604
|
ignore_unavailable: Optional[bool] = True,
|
|
610
605
|
):
|
|
611
606
|
"""Return aggregations of STAC Items."""
|
|
@@ -639,7 +634,10 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
639
634
|
if k in aggregations
|
|
640
635
|
}
|
|
641
636
|
|
|
642
|
-
index_param =
|
|
637
|
+
index_param = await self.async_index_selector.select_indexes(
|
|
638
|
+
collection_ids, datetime_search
|
|
639
|
+
)
|
|
640
|
+
|
|
643
641
|
search_task = asyncio.create_task(
|
|
644
642
|
self.client.search(
|
|
645
643
|
index=index_param,
|
|
@@ -832,8 +830,13 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
832
830
|
item = await self.async_prep_create_item(
|
|
833
831
|
item=item, base_url=base_url, exist_ok=exist_ok
|
|
834
832
|
)
|
|
833
|
+
|
|
834
|
+
target_index = await self.async_index_inserter.get_target_index(
|
|
835
|
+
collection_id, item
|
|
836
|
+
)
|
|
837
|
+
|
|
835
838
|
await self.client.index(
|
|
836
|
-
index=
|
|
839
|
+
index=target_index,
|
|
837
840
|
id=mk_item_id(item_id, collection_id),
|
|
838
841
|
body=item,
|
|
839
842
|
refresh=refresh,
|
|
@@ -912,13 +915,28 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
912
915
|
script = operations_to_script(script_operations)
|
|
913
916
|
|
|
914
917
|
try:
|
|
915
|
-
await self.client.
|
|
918
|
+
search_response = await self.client.search(
|
|
916
919
|
index=index_alias_by_collection_id(collection_id),
|
|
920
|
+
body={
|
|
921
|
+
"query": {"term": {"_id": mk_item_id(item_id, collection_id)}},
|
|
922
|
+
"size": 1,
|
|
923
|
+
},
|
|
924
|
+
)
|
|
925
|
+
if search_response["hits"]["total"]["value"] == 0:
|
|
926
|
+
raise NotFoundError(
|
|
927
|
+
f"Item {item_id} does not exist inside Collection {collection_id}"
|
|
928
|
+
)
|
|
929
|
+
document_index = search_response["hits"]["hits"][0]["_index"]
|
|
930
|
+
await self.client.update(
|
|
931
|
+
index=document_index,
|
|
917
932
|
id=mk_item_id(item_id, collection_id),
|
|
918
933
|
body={"script": script},
|
|
919
934
|
refresh=True,
|
|
920
935
|
)
|
|
921
|
-
|
|
936
|
+
except exceptions.NotFoundError:
|
|
937
|
+
raise NotFoundError(
|
|
938
|
+
f"Item {item_id} does not exist inside Collection {collection_id}"
|
|
939
|
+
)
|
|
922
940
|
except exceptions.RequestError as exc:
|
|
923
941
|
raise HTTPException(
|
|
924
942
|
status_code=400, detail=exc.info["error"]["caused_by"]
|
|
@@ -937,8 +955,8 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
937
955
|
"script": {
|
|
938
956
|
"lang": "painless",
|
|
939
957
|
"source": (
|
|
940
|
-
f"""ctx._id = ctx._id.replace('{collection_id}', '{new_collection_id}');"""
|
|
941
|
-
f"""ctx._source.collection = '{new_collection_id}';"""
|
|
958
|
+
f"""ctx._id = ctx._id.replace('{collection_id}', '{new_collection_id}');""" # noqa: E702
|
|
959
|
+
f"""ctx._source.collection = '{new_collection_id}';""" # noqa: E702
|
|
942
960
|
),
|
|
943
961
|
},
|
|
944
962
|
},
|
|
@@ -992,9 +1010,9 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
992
1010
|
)
|
|
993
1011
|
|
|
994
1012
|
try:
|
|
995
|
-
await self.client.
|
|
1013
|
+
await self.client.delete_by_query(
|
|
996
1014
|
index=index_alias_by_collection_id(collection_id),
|
|
997
|
-
|
|
1015
|
+
body={"query": {"term": {"_id": mk_item_id(item_id, collection_id)}}},
|
|
998
1016
|
refresh=refresh,
|
|
999
1017
|
)
|
|
1000
1018
|
except exceptions.NotFoundError:
|
|
@@ -1085,8 +1103,10 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1085
1103
|
body=collection,
|
|
1086
1104
|
refresh=refresh,
|
|
1087
1105
|
)
|
|
1088
|
-
|
|
1089
|
-
|
|
1106
|
+
if self.async_index_inserter.should_create_collection_index():
|
|
1107
|
+
await self.async_index_inserter.create_simple_index(
|
|
1108
|
+
self.client, collection_id
|
|
1109
|
+
)
|
|
1090
1110
|
|
|
1091
1111
|
async def find_collection(self, collection_id: str) -> Collection:
|
|
1092
1112
|
"""Find and return a collection from the database.
|
|
@@ -1295,6 +1315,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1295
1315
|
await self.client.delete(
|
|
1296
1316
|
index=COLLECTIONS_INDEX, id=collection_id, refresh=refresh
|
|
1297
1317
|
)
|
|
1318
|
+
# Delete the item index for the collection
|
|
1298
1319
|
await delete_item_index(collection_id)
|
|
1299
1320
|
|
|
1300
1321
|
async def bulk_async(
|
|
@@ -1348,9 +1369,13 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1348
1369
|
return 0, []
|
|
1349
1370
|
|
|
1350
1371
|
raise_on_error = self.async_settings.raise_on_bulk_error
|
|
1372
|
+
actions = await self.async_index_inserter.prepare_bulk_actions(
|
|
1373
|
+
collection_id, processed_items
|
|
1374
|
+
)
|
|
1375
|
+
|
|
1351
1376
|
success, errors = await helpers.async_bulk(
|
|
1352
1377
|
self.client,
|
|
1353
|
-
|
|
1378
|
+
actions,
|
|
1354
1379
|
refresh=refresh,
|
|
1355
1380
|
raise_on_error=raise_on_error,
|
|
1356
1381
|
)
|
|
@@ -1405,6 +1430,11 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1405
1430
|
f"Performing bulk insert for collection {collection_id} with refresh={refresh}"
|
|
1406
1431
|
)
|
|
1407
1432
|
|
|
1433
|
+
# Handle empty processed_items
|
|
1434
|
+
if not processed_items:
|
|
1435
|
+
logger.warning(f"No items to insert for collection {collection_id}")
|
|
1436
|
+
return 0, []
|
|
1437
|
+
|
|
1408
1438
|
# Handle empty processed_items
|
|
1409
1439
|
if not processed_items:
|
|
1410
1440
|
logger.warning(f"No items to insert for collection {collection_id}")
|
{stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/version.py
RENAMED
|
@@ -1,2 +1,2 @@
|
|
|
1
1
|
"""library version."""
|
|
2
|
-
__version__ = "6.
|
|
2
|
+
__version__ = "6.2.0"
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.1
|
|
2
2
|
Name: stac-fastapi-opensearch
|
|
3
|
-
Version: 6.
|
|
3
|
+
Version: 6.2.0
|
|
4
4
|
Summary: Opensearch stac-fastapi backend.
|
|
5
5
|
Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
|
|
6
6
|
License: MIT
|
|
@@ -36,7 +36,7 @@ Provides-Extra: server
|
|
|
36
36
|
[](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/network/members)
|
|
37
37
|
[](https://pypi.org/project/stac-fastapi-elasticsearch/)
|
|
38
38
|
[](https://github.com/radiantearth/stac-spec/tree/v1.1.0)
|
|
39
|
-
[](https://github.com/stac-utils/stac-fastapi)
|
|
40
40
|
|
|
41
41
|
## Sponsors & Supporters
|
|
42
42
|
|
|
@@ -106,6 +106,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
|
|
|
106
106
|
- [Auth](#auth)
|
|
107
107
|
- [Aggregation](#aggregation)
|
|
108
108
|
- [Rate Limiting](#rate-limiting)
|
|
109
|
+
- [Datetime-Based Index Management](#datetime-based-index-management)
|
|
109
110
|
|
|
110
111
|
## Documentation & Resources
|
|
111
112
|
|
|
@@ -226,28 +227,105 @@ You can customize additional settings in your `.env` file:
|
|
|
226
227
|
|------------------------------|--------------------------------------------------------------------------------------|--------------------------|---------------------------------------------------------------------------------------------|
|
|
227
228
|
| `ES_HOST` | Hostname for external Elasticsearch/OpenSearch. | `localhost` | Optional |
|
|
228
229
|
| `ES_PORT` | Port for Elasticsearch/OpenSearch. | `9200` (ES) / `9202` (OS)| Optional |
|
|
229
|
-
| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `
|
|
230
|
-
| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `
|
|
230
|
+
| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `true` | Optional |
|
|
231
|
+
| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `true` | Optional |
|
|
232
|
+
| `ES_API_KEY` | API Key for external Elasticsearch/OpenSearch. | N/A | Optional |
|
|
233
|
+
| `ES_TIMEOUT` | Client timeout for Elasticsearch/OpenSearch. | DB client default | Optional |
|
|
231
234
|
| `STAC_FASTAPI_TITLE` | Title of the API in the documentation. | `stac-fastapi-<backend>` | Optional |
|
|
232
235
|
| `STAC_FASTAPI_DESCRIPTION` | Description of the API in the documentation. | N/A | Optional |
|
|
233
236
|
| `STAC_FASTAPI_VERSION` | API version. | `2.1` | Optional |
|
|
234
|
-
| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID
|
|
237
|
+
| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID | `stac-fastapi` | Optional |
|
|
235
238
|
| `APP_HOST` | Server bind address. | `0.0.0.0` | Optional |
|
|
236
|
-
| `APP_PORT` | Server port. | `
|
|
239
|
+
| `APP_PORT` | Server port. | `8000` | Optional |
|
|
237
240
|
| `ENVIRONMENT` | Runtime environment. | `local` | Optional |
|
|
238
241
|
| `WEB_CONCURRENCY` | Number of worker processes. | `10` | Optional |
|
|
239
242
|
| `RELOAD` | Enable auto-reload for development. | `true` | Optional |
|
|
240
243
|
| `STAC_FASTAPI_RATE_LIMIT` | API rate limit per client. | `200/minute` | Optional |
|
|
241
|
-
| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional
|
|
242
|
-
| `ELASTICSEARCH_VERSION`
|
|
243
|
-
| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional
|
|
244
|
-
| `ENABLE_DIRECT_RESPONSE`
|
|
245
|
-
| `RAISE_ON_BULK_ERROR`
|
|
246
|
-
| `DATABASE_REFRESH`
|
|
244
|
+
| `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional |
|
|
245
|
+
| `ELASTICSEARCH_VERSION` | Version of Elasticsearch to use. | `8.11.0` | Optional |
|
|
246
|
+
| `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional |
|
|
247
|
+
| `ENABLE_DIRECT_RESPONSE` | Enable direct response for maximum performance (disables all FastAPI dependencies, including authentication, custom status codes, and validation) | `false` | Optional |
|
|
248
|
+
| `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
|
|
249
|
+
| `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
|
|
247
250
|
| `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
|
|
248
251
|
|
|
249
252
|
> [!NOTE]
|
|
250
|
-
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, and `
|
|
253
|
+
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
|
|
254
|
+
|
|
255
|
+
## Datetime-Based Index Management
|
|
256
|
+
|
|
257
|
+
### Overview
|
|
258
|
+
|
|
259
|
+
SFEOS supports two indexing strategies for managing STAC items:
|
|
260
|
+
|
|
261
|
+
1. **Simple Indexing** (default) - One index per collection
|
|
262
|
+
2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
|
|
263
|
+
|
|
264
|
+
The datetime-based indexing strategy is particularly useful for large temporal datasets. When a user provides a datetime parameter in a query, the system knows exactly which index to search, providing **multiple times faster searches** and significantly **reducing database load**.
|
|
265
|
+
|
|
266
|
+
### When to Use
|
|
267
|
+
|
|
268
|
+
**Recommended for:**
|
|
269
|
+
- Systems with large collections containing millions of items
|
|
270
|
+
- Systems requiring high-performance temporal searching
|
|
271
|
+
|
|
272
|
+
**Pros:**
|
|
273
|
+
- Multiple times faster queries with datetime filter
|
|
274
|
+
- Reduced database load - only relevant indexes are searched
|
|
275
|
+
|
|
276
|
+
**Cons:**
|
|
277
|
+
- Slightly longer item indexing time (automatic index management)
|
|
278
|
+
- Greater management complexity
|
|
279
|
+
|
|
280
|
+
### Configuration
|
|
281
|
+
|
|
282
|
+
#### Enabling Datetime-Based Indexing
|
|
283
|
+
|
|
284
|
+
Enable datetime-based indexing by setting the following environment variable:
|
|
285
|
+
|
|
286
|
+
```bash
|
|
287
|
+
ENABLE_DATETIME_INDEX_FILTERING=true
|
|
288
|
+
```
|
|
289
|
+
|
|
290
|
+
### Related Configuration Variables
|
|
291
|
+
|
|
292
|
+
| Variable | Description | Default | Example |
|
|
293
|
+
|----------|-------------|---------|---------|
|
|
294
|
+
| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
|
|
295
|
+
| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
|
|
296
|
+
| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
|
|
297
|
+
|
|
298
|
+
## How Datetime-Based Indexing Works
|
|
299
|
+
|
|
300
|
+
### Index and Alias Naming Convention
|
|
301
|
+
|
|
302
|
+
The system uses a precise naming convention:
|
|
303
|
+
|
|
304
|
+
**Physical indexes:**
|
|
305
|
+
```
|
|
306
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
|
|
307
|
+
```
|
|
308
|
+
|
|
309
|
+
**Aliases:**
|
|
310
|
+
```
|
|
311
|
+
{ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
|
|
312
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
|
|
313
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
|
|
314
|
+
```
|
|
315
|
+
|
|
316
|
+
**Example:**
|
|
317
|
+
|
|
318
|
+
*Physical indexes:*
|
|
319
|
+
- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
|
|
320
|
+
|
|
321
|
+
*Aliases:*
|
|
322
|
+
- `items_sentinel-2-l2a` - main collection alias
|
|
323
|
+
- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
|
|
324
|
+
- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
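
As a rough illustration of the naming convention above, the sketch below builds the three alias forms and a physical index name. The helper names `physical_index_name` and `alias_name` are hypothetical and are not part of SFEOS. The `items_` prefix is the documented default of `STAC_ITEMS_INDEX_PREFIX`, and any escaping of special characters in collection ids is ignored here.

```python
import uuid
from typing import Optional

# Documented default of STAC_ITEMS_INDEX_PREFIX; adjust if you override it.
ITEMS_INDEX_PREFIX = "items_"


def physical_index_name(collection_id: str) -> str:
    """Hypothetical helper: {prefix}{collection-id}_{uuid4}."""
    return f"{ITEMS_INDEX_PREFIX}{collection_id}_{uuid.uuid4()}"


def alias_name(
    collection_id: str,
    start: Optional[str] = None,
    end: Optional[str] = None,
) -> str:
    """Hypothetical helper: main, temporal, or closed-index alias.

    No dates     -> main collection alias
    start only   -> temporal alias of the active index
    start + end  -> alias of a closed index
    """
    name = f"{ITEMS_INDEX_PREFIX}{collection_id}"
    if start is not None:
        name += f"_{start}"
        if end is not None:
            name += f"_{end}"
    return name


main_alias = alias_name("sentinel-2-l2a")
active_alias = alias_name("sentinel-2-l2a", "2024-01-01")
closed_alias = alias_name("sentinel-2-l2a", "2024-01-01", "2024-03-15")
```

With the example collection, the three calls yield the same alias strings as listed in the example above; the physical index name differs on every call because of the random UUID suffix.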
|
|
325
|
+
|
|
326
|
+
### Index Size Management
|
|
327
|
+
|
|
328
|
+
**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add +20% to the target size to account for compression overhead and metadata.
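
The +20% recommendation can be read as simple arithmetic on the value passed to `DATETIME_INDEX_MAX_SIZE_GB`. The sketch below assumes that interpretation; the function name and the `headroom` parameter are illustrative only and do not exist in SFEOS.

```python
def configured_max_size_gb(target_gb: float, headroom: float = 0.20) -> float:
    """Hypothetical helper: value to set for DATETIME_INDEX_MAX_SIZE_GB.

    Adds the recommended 20% on top of the size you are actually targeting.
    """
    return round(target_gb * (1 + headroom), 1)


# Targeting roughly 25 GB per index suggests configuring 30 GB.
limit_for_25 = configured_max_size_gb(25)
limit_for_50 = configured_max_size_gb(50)
```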
|
|
251
329
|
|
|
252
330
|
## Interacting with the API
|
|
253
331
|
|
|
@@ -557,4 +635,3 @@ You can customize additional settings in your `.env` file:
|
|
|
557
635
|
- Ensures fair resource allocation among all clients
|
|
558
636
|
|
|
559
637
|
- **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
|
|
560
|
-
|
|
File without changes
|
{stac_fastapi_opensearch-6.0.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/__init__.py
RENAMED
|
File without changes
|