stac-fastapi-opensearch 6.1.0__tar.gz → 6.2.1__tar.gz
- {stac_fastapi_opensearch-6.1.0/stac_fastapi_opensearch.egg-info → stac_fastapi_opensearch-6.2.1}/PKG-INFO +105 -4
- stac_fastapi_opensearch-6.1.0/PKG-INFO → stac_fastapi_opensearch-6.2.1/README.md +77 -22
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/setup.py +2 -2
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi/opensearch/app.py +1 -1
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi/opensearch/config.py +1 -0
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi/opensearch/database_logic.py +95 -69
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi/opensearch/version.py +1 -1
- stac_fastapi_opensearch-6.1.0/README.md → stac_fastapi_opensearch-6.2.1/stac_fastapi_opensearch.egg-info/PKG-INFO +123 -1
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi_opensearch.egg-info/requires.txt +2 -2
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi_opensearch.egg-info/top_level.txt +1 -0
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/setup.cfg +0 -0
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi/opensearch/__init__.py +0 -0
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi_opensearch.egg-info/SOURCES.txt +0 -0
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi_opensearch.egg-info/dependency_links.txt +0 -0
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi_opensearch.egg-info/entry_points.txt +0 -0
- {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi_opensearch.egg-info/not-zip-safe +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
|
-
Metadata-Version: 2.
|
|
2
|
-
Name:
|
|
3
|
-
Version: 6.1
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: stac_fastapi_opensearch
|
|
3
|
+
Version: 6.2.1
|
|
4
4
|
Summary: Opensearch stac-fastapi backend.
|
|
5
5
|
Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
|
|
6
6
|
License: MIT
|
|
@@ -15,9 +15,34 @@ Classifier: Programming Language :: Python :: 3.13
|
|
|
15
15
|
Classifier: License :: OSI Approved :: MIT License
|
|
16
16
|
Requires-Python: >=3.9
|
|
17
17
|
Description-Content-Type: text/markdown
|
|
18
|
+
Requires-Dist: stac-fastapi-core==6.2.1
|
|
19
|
+
Requires-Dist: sfeos-helpers==6.2.1
|
|
20
|
+
Requires-Dist: opensearch-py~=2.8.0
|
|
21
|
+
Requires-Dist: opensearch-py[async]~=2.8.0
|
|
22
|
+
Requires-Dist: uvicorn~=0.23.0
|
|
23
|
+
Requires-Dist: starlette<0.36.0,>=0.35.0
|
|
18
24
|
Provides-Extra: dev
|
|
25
|
+
Requires-Dist: pytest~=7.0.0; extra == "dev"
|
|
26
|
+
Requires-Dist: pytest-cov~=4.0.0; extra == "dev"
|
|
27
|
+
Requires-Dist: pytest-asyncio~=0.21.0; extra == "dev"
|
|
28
|
+
Requires-Dist: pre-commit~=3.0.0; extra == "dev"
|
|
29
|
+
Requires-Dist: ciso8601~=2.3.0; extra == "dev"
|
|
30
|
+
Requires-Dist: httpx<0.28.0,>=0.24.0; extra == "dev"
|
|
19
31
|
Provides-Extra: docs
|
|
32
|
+
Requires-Dist: mkdocs~=1.4.0; extra == "docs"
|
|
33
|
+
Requires-Dist: mkdocs-material~=9.0.0; extra == "docs"
|
|
34
|
+
Requires-Dist: pdocs~=1.2.0; extra == "docs"
|
|
20
35
|
Provides-Extra: server
|
|
36
|
+
Requires-Dist: uvicorn[standard]~=0.23.0; extra == "server"
|
|
37
|
+
Dynamic: classifier
|
|
38
|
+
Dynamic: description
|
|
39
|
+
Dynamic: description-content-type
|
|
40
|
+
Dynamic: home-page
|
|
41
|
+
Dynamic: license
|
|
42
|
+
Dynamic: provides-extra
|
|
43
|
+
Dynamic: requires-dist
|
|
44
|
+
Dynamic: requires-python
|
|
45
|
+
Dynamic: summary
|
|
21
46
|
|
|
22
47
|
# stac-fastapi-elasticsearch-opensearch
|
|
23
48
|
|
|
@@ -106,6 +131,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
|
|
|
106
131
|
- [Auth](#auth)
|
|
107
132
|
- [Aggregation](#aggregation)
|
|
108
133
|
- [Rate Limiting](#rate-limiting)
|
|
134
|
+
- [Datetime-Based Index Management](#datetime-based-index-management)
|
|
109
135
|
|
|
110
136
|
## Documentation & Resources
|
|
111
137
|
|
|
@@ -247,10 +273,86 @@ You can customize additional settings in your `.env` file:
|
|
|
247
273
|
| `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
|
|
248
274
|
| `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
|
|
249
275
|
| `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
|
|
276
|
+
| `STAC_ITEM_LIMIT` | Sets the limit on the number of items and STAC collections that SFEOS returns. | `10` | Optional |
|
|
250
277
|
|
|
251
278
|
> [!NOTE]
|
|
252
279
|
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
|
|
253
280
|
|
|
281
|
+
## Datetime-Based Index Management
|
|
282
|
+
|
|
283
|
+
### Overview
|
|
284
|
+
|
|
285
|
+
SFEOS supports two indexing strategies for managing STAC items:
|
|
286
|
+
|
|
287
|
+
1. **Simple Indexing** (default) - One index per collection
|
|
288
|
+
2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
|
|
289
|
+
|
|
290
|
+
The datetime-based indexing strategy is particularly useful for large temporal datasets. When a query includes a datetime parameter, the system knows exactly which indexes to search, which makes searches **several times faster** and significantly **reduces database load**.
|
|
291
|
+
|
|
292
|
+
### When to Use
|
|
293
|
+
|
|
294
|
+
**Recommended for:**
|
|
295
|
+
- Systems with large collections containing millions of items
|
|
296
|
+
- Systems requiring high-performance temporal searching
|
|
297
|
+
|
|
298
|
+
**Pros:**
|
|
299
|
+
- Queries with a datetime filter run several times faster
|
|
300
|
+
- Reduced database load - only relevant indexes are searched
|
|
301
|
+
|
|
302
|
+
**Cons:**
|
|
303
|
+
- Slightly longer item indexing time (automatic index management)
|
|
304
|
+
- Greater management complexity
|
|
305
|
+
|
|
306
|
+
### Configuration
|
|
307
|
+
|
|
308
|
+
#### Enabling Datetime-Based Indexing
|
|
309
|
+
|
|
310
|
+
Enable datetime-based indexing by setting the following environment variable:
|
|
311
|
+
|
|
312
|
+
```bash
|
|
313
|
+
ENABLE_DATETIME_INDEX_FILTERING=true
|
|
314
|
+
```
|
|
315
|
+
|
|
316
|
+
### Related Configuration Variables
|
|
317
|
+
|
|
318
|
+
| Variable | Description | Default | Example |
|
|
319
|
+
|----------|-------------|---------|---------|
|
|
320
|
+
| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
|
|
321
|
+
| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
|
|
322
|
+
| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
|
|
323
|
+
|
|
324
|
+
## How Datetime-Based Indexing Works
|
|
325
|
+
|
|
326
|
+
### Index and Alias Naming Convention
|
|
327
|
+
|
|
328
|
+
The system uses a precise naming convention:
|
|
329
|
+
|
|
330
|
+
**Physical indexes:**
|
|
331
|
+
```
|
|
332
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
|
|
333
|
+
```
|
|
334
|
+
|
|
335
|
+
**Aliases:**
|
|
336
|
+
```
|
|
337
|
+
{ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
|
|
338
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
|
|
339
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
|
|
340
|
+
```
|
|
341
|
+
|
|
342
|
+
**Example:**
|
|
343
|
+
|
|
344
|
+
*Physical indexes:*
|
|
345
|
+
- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
|
|
346
|
+
|
|
347
|
+
*Aliases:*
|
|
348
|
+
- `items_sentinel-2-l2a` - main collection alias
|
|
349
|
+
- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
|
|
350
|
+
- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
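The naming convention above can be sketched in a few lines of Python. This is only an illustration: the helper names `physical_index_name` and `alias_names` are hypothetical and not part of SFEOS, the prefix uses the documented default `items_`, and any sanitizing of collection ids that the real code may perform is ignored.

```python
import uuid
from typing import List, Optional

# Documented default of STAC_ITEMS_INDEX_PREFIX; assumed here for illustration.
ITEMS_INDEX_PREFIX = "items_"


def physical_index_name(collection_id: str) -> str:
    """Hypothetical helper: prefix + collection id + a random UUID4 suffix."""
    return f"{ITEMS_INDEX_PREFIX}{collection_id}_{uuid.uuid4()}"


def alias_names(
    collection_id: str, start: str, end: Optional[str] = None
) -> List[str]:
    """Hypothetical helper: the main collection alias plus one temporal alias.

    Without `end` the temporal alias describes an active index; with `end`
    it describes a closed index.
    """
    main = f"{ITEMS_INDEX_PREFIX}{collection_id}"
    if end is None:
        temporal = f"{main}_{start}"
    else:
        temporal = f"{main}_{start}_{end}"
    return [main, temporal]
```

Calling `alias_names("sentinel-2-l2a", "2024-01-01", "2024-03-15")` reproduces the main alias and the closed index alias from the example above.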
|
|
351
|
+
|
|
352
|
+
### Index Size Management
|
|
353
|
+
|
|
354
|
+
**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data, and the configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add 20% to the target size to account for compression overhead and metadata.
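One reading of this recommendation is that the configured value should be the intended target size plus 20%. The sketch below follows that reading; the function name `configured_limit_gb` is hypothetical, and rounding up to a whole number of gigabytes is an assumption made for illustration.

```python
def configured_limit_gb(target_gb: int, overhead_percent: int = 20) -> int:
    """Return a value for DATETIME_INDEX_MAX_SIZE_GB given a target size.

    Uses integer arithmetic and rounds up to the next whole gigabyte.
    """
    scaled = target_gb * (100 + overhead_percent)
    return (scaled + 99) // 100


# A 25 GB target would be configured as 30 GB under this reading.
limit = configured_limit_gb(25)
```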
|
|
355
|
+
|
|
254
356
|
## Interacting with the API
|
|
255
357
|
|
|
256
358
|
- **Creating a Collection**:
|
|
@@ -559,4 +661,3 @@ You can customize additional settings in your `.env` file:
|
|
|
559
661
|
- Ensures fair resource allocation among all clients
|
|
560
662
|
|
|
561
663
|
- **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
|
|
562
|
-
|
|
@@ -1,24 +1,3 @@
|
|
|
1
|
-
Metadata-Version: 2.1
|
|
2
|
-
Name: stac_fastapi_opensearch
|
|
3
|
-
Version: 6.1.0
|
|
4
|
-
Summary: Opensearch stac-fastapi backend.
|
|
5
|
-
Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
|
|
6
|
-
License: MIT
|
|
7
|
-
Classifier: Intended Audience :: Developers
|
|
8
|
-
Classifier: Intended Audience :: Information Technology
|
|
9
|
-
Classifier: Intended Audience :: Science/Research
|
|
10
|
-
Classifier: Programming Language :: Python :: 3.9
|
|
11
|
-
Classifier: Programming Language :: Python :: 3.10
|
|
12
|
-
Classifier: Programming Language :: Python :: 3.11
|
|
13
|
-
Classifier: Programming Language :: Python :: 3.12
|
|
14
|
-
Classifier: Programming Language :: Python :: 3.13
|
|
15
|
-
Classifier: License :: OSI Approved :: MIT License
|
|
16
|
-
Requires-Python: >=3.9
|
|
17
|
-
Description-Content-Type: text/markdown
|
|
18
|
-
Provides-Extra: dev
|
|
19
|
-
Provides-Extra: docs
|
|
20
|
-
Provides-Extra: server
|
|
21
|
-
|
|
22
1
|
# stac-fastapi-elasticsearch-opensearch
|
|
23
2
|
|
|
24
3
|
<!-- markdownlint-disable MD033 MD041 -->
|
|
@@ -106,6 +85,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
|
|
|
106
85
|
- [Auth](#auth)
|
|
107
86
|
- [Aggregation](#aggregation)
|
|
108
87
|
- [Rate Limiting](#rate-limiting)
|
|
88
|
+
- [Datetime-Based Index Management](#datetime-based-index-management)
|
|
109
89
|
|
|
110
90
|
## Documentation & Resources
|
|
111
91
|
|
|
@@ -247,10 +227,86 @@ You can customize additional settings in your `.env` file:
|
|
|
247
227
|
| `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
|
|
248
228
|
| `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
|
|
249
229
|
| `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
|
|
230
|
+
| `STAC_ITEM_LIMIT` | Sets the limit on the number of items and STAC collections that SFEOS returns. | `10` | Optional |
|
|
250
231
|
|
|
251
232
|
> [!NOTE]
|
|
252
233
|
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
|
|
253
234
|
|
|
235
|
+
## Datetime-Based Index Management
|
|
236
|
+
|
|
237
|
+
### Overview
|
|
238
|
+
|
|
239
|
+
SFEOS supports two indexing strategies for managing STAC items:
|
|
240
|
+
|
|
241
|
+
1. **Simple Indexing** (default) - One index per collection
|
|
242
|
+
2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
|
|
243
|
+
|
|
244
|
+
The datetime-based indexing strategy is particularly useful for large temporal datasets. When a query includes a datetime parameter, the system knows exactly which indexes to search, which makes searches **several times faster** and significantly **reduces database load**.
|
|
245
|
+
|
|
246
|
+
### When to Use
|
|
247
|
+
|
|
248
|
+
**Recommended for:**
|
|
249
|
+
- Systems with large collections containing millions of items
|
|
250
|
+
- Systems requiring high-performance temporal searching
|
|
251
|
+
|
|
252
|
+
**Pros:**
|
|
253
|
+
- Queries with a datetime filter run several times faster
|
|
254
|
+
- Reduced database load - only relevant indexes are searched
|
|
255
|
+
|
|
256
|
+
**Cons:**
|
|
257
|
+
- Slightly longer item indexing time (automatic index management)
|
|
258
|
+
- Greater management complexity
|
|
259
|
+
|
|
260
|
+
### Configuration
|
|
261
|
+
|
|
262
|
+
#### Enabling Datetime-Based Indexing
|
|
263
|
+
|
|
264
|
+
Enable datetime-based indexing by setting the following environment variable:
|
|
265
|
+
|
|
266
|
+
```bash
|
|
267
|
+
ENABLE_DATETIME_INDEX_FILTERING=true
|
|
268
|
+
```
|
|
269
|
+
|
|
270
|
+
### Related Configuration Variables
|
|
271
|
+
|
|
272
|
+
| Variable | Description | Default | Example |
|
|
273
|
+
|----------|-------------|---------|---------|
|
|
274
|
+
| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
|
|
275
|
+
| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
|
|
276
|
+
| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
|
|
277
|
+
|
|
278
|
+
## How Datetime-Based Indexing Works
|
|
279
|
+
|
|
280
|
+
### Index and Alias Naming Convention
|
|
281
|
+
|
|
282
|
+
The system uses a precise naming convention:
|
|
283
|
+
|
|
284
|
+
**Physical indexes:**
|
|
285
|
+
```
|
|
286
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
|
|
287
|
+
```
|
|
288
|
+
|
|
289
|
+
**Aliases:**
|
|
290
|
+
```
|
|
291
|
+
{ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
|
|
292
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
|
|
293
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
|
|
294
|
+
```
|
|
295
|
+
|
|
296
|
+
**Example:**
|
|
297
|
+
|
|
298
|
+
*Physical indexes:*
|
|
299
|
+
- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
|
|
300
|
+
|
|
301
|
+
*Aliases:*
|
|
302
|
+
- `items_sentinel-2-l2a` - main collection alias
|
|
303
|
+
- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
|
|
304
|
+
- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
|
|
305
|
+
|
|
306
|
+
### Index Size Management
|
|
307
|
+
|
|
308
|
+
**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data, and the configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add 20% to the target size to account for compression overhead and metadata.
|
|
309
|
+
|
|
254
310
|
## Interacting with the API
|
|
255
311
|
|
|
256
312
|
- **Creating a Collection**:
|
|
@@ -559,4 +615,3 @@ You can customize additional settings in your `.env` file:
|
|
|
559
615
|
- Ensures fair resource allocation among all clients
|
|
560
616
|
|
|
561
617
|
- **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
|
|
562
|
-
|
|
@@ -6,8 +6,8 @@ with open("README.md") as f:
|
|
|
6
6
|
desc = f.read()
|
|
7
7
|
|
|
8
8
|
install_requires = [
|
|
9
|
-
"stac-fastapi-core==6.1
|
|
10
|
-
"sfeos-helpers==6.1
|
|
9
|
+
"stac-fastapi-core==6.2.1",
|
|
10
|
+
"sfeos-helpers==6.2.1",
|
|
11
11
|
"opensearch-py~=2.8.0",
|
|
12
12
|
"opensearch-py[async]~=2.8.0",
|
|
13
13
|
"uvicorn~=0.23.0",
|
{stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi/opensearch/app.py
RENAMED
|
@@ -118,7 +118,7 @@ post_request_model = create_post_request_model(search_extensions)
|
|
|
118
118
|
app_config = {
|
|
119
119
|
"title": os.getenv("STAC_FASTAPI_TITLE", "stac-fastapi-opensearch"),
|
|
120
120
|
"description": os.getenv("STAC_FASTAPI_DESCRIPTION", "stac-fastapi-opensearch"),
|
|
121
|
-
"api_version": os.getenv("STAC_FASTAPI_VERSION", "6.1
|
|
121
|
+
"api_version": os.getenv("STAC_FASTAPI_VERSION", "6.2.1"),
|
|
122
122
|
"settings": settings,
|
|
123
123
|
"extensions": extensions,
|
|
124
124
|
"client": CoreClient(
|
|
@@ -4,7 +4,7 @@ import asyncio
|
|
|
4
4
|
import logging
|
|
5
5
|
from base64 import urlsafe_b64decode, urlsafe_b64encode
|
|
6
6
|
from copy import deepcopy
|
|
7
|
-
from typing import Any, Dict, Iterable, List, Optional, Tuple, Type
|
|
7
|
+
from typing import Any, Dict, Iterable, List, Optional, Tuple, Type
|
|
8
8
|
|
|
9
9
|
import attr
|
|
10
10
|
import orjson
|
|
@@ -26,7 +26,7 @@ from stac_fastapi.opensearch.config import (
|
|
|
26
26
|
AsyncOpensearchSettings as AsyncSearchSettings,
|
|
27
27
|
)
|
|
28
28
|
from stac_fastapi.opensearch.config import OpensearchSettings as SyncSearchSettings
|
|
29
|
-
from stac_fastapi.sfeos_helpers import filter
|
|
29
|
+
from stac_fastapi.sfeos_helpers import filter as filter_module
|
|
30
30
|
from stac_fastapi.sfeos_helpers.database import (
|
|
31
31
|
apply_free_text_filter_shared,
|
|
32
32
|
apply_intersects_filter_shared,
|
|
@@ -34,8 +34,6 @@ from stac_fastapi.sfeos_helpers.database import (
|
|
|
34
34
|
delete_item_index_shared,
|
|
35
35
|
get_queryables_mapping_shared,
|
|
36
36
|
index_alias_by_collection_id,
|
|
37
|
-
index_by_collection_id,
|
|
38
|
-
indices,
|
|
39
37
|
mk_actions,
|
|
40
38
|
mk_item_id,
|
|
41
39
|
populate_sort_shared,
|
|
@@ -55,15 +53,18 @@ from stac_fastapi.sfeos_helpers.mappings import (
|
|
|
55
53
|
COLLECTIONS_INDEX,
|
|
56
54
|
DEFAULT_SORT,
|
|
57
55
|
ES_COLLECTIONS_MAPPINGS,
|
|
58
|
-
ES_ITEMS_MAPPINGS,
|
|
59
|
-
ES_ITEMS_SETTINGS,
|
|
60
56
|
ITEM_INDICES,
|
|
61
57
|
ITEMS_INDEX_PREFIX,
|
|
62
58
|
Geometry,
|
|
63
59
|
)
|
|
60
|
+
from stac_fastapi.sfeos_helpers.search_engine import (
|
|
61
|
+
BaseIndexInserter,
|
|
62
|
+
BaseIndexSelector,
|
|
63
|
+
IndexInsertionFactory,
|
|
64
|
+
IndexSelectorFactory,
|
|
65
|
+
)
|
|
64
66
|
from stac_fastapi.types.errors import ConflictError, NotFoundError
|
|
65
67
|
from stac_fastapi.types.links import resolve_links
|
|
66
|
-
from stac_fastapi.types.rfc3339 import DateTimeType
|
|
67
68
|
from stac_fastapi.types.stac import Collection, Item
|
|
68
69
|
|
|
69
70
|
logger = logging.getLogger(__name__)
|
|
@@ -104,33 +105,6 @@ async def create_collection_index() -> None:
|
|
|
104
105
|
await client.close()
|
|
105
106
|
|
|
106
107
|
|
|
107
|
-
async def create_item_index(collection_id: str) -> None:
|
|
108
|
-
"""
|
|
109
|
-
Create the index for Items. The settings of the index template will be used implicitly.
|
|
110
|
-
|
|
111
|
-
Args:
|
|
112
|
-
collection_id (str): Collection identifier.
|
|
113
|
-
|
|
114
|
-
Returns:
|
|
115
|
-
None
|
|
116
|
-
|
|
117
|
-
"""
|
|
118
|
-
client = AsyncSearchSettings().create_client
|
|
119
|
-
|
|
120
|
-
index_name = f"{index_by_collection_id(collection_id)}-000001"
|
|
121
|
-
exists = await client.indices.exists(index=index_name)
|
|
122
|
-
if not exists:
|
|
123
|
-
await client.indices.create(
|
|
124
|
-
index=index_name,
|
|
125
|
-
body={
|
|
126
|
-
"aliases": {index_alias_by_collection_id(collection_id): {}},
|
|
127
|
-
"mappings": ES_ITEMS_MAPPINGS,
|
|
128
|
-
"settings": ES_ITEMS_SETTINGS,
|
|
129
|
-
},
|
|
130
|
-
)
|
|
131
|
-
await client.close()
|
|
132
|
-
|
|
133
|
-
|
|
134
108
|
async def delete_item_index(collection_id: str) -> None:
|
|
135
109
|
"""Delete the index for items in a collection.
|
|
136
110
|
|
|
@@ -152,6 +126,9 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
152
126
|
async_settings: AsyncSearchSettings = attr.ib(factory=AsyncSearchSettings)
|
|
153
127
|
sync_settings: SyncSearchSettings = attr.ib(factory=SyncSearchSettings)
|
|
154
128
|
|
|
129
|
+
async_index_selector: BaseIndexSelector = attr.ib(init=False)
|
|
130
|
+
async_index_inserter: BaseIndexInserter = attr.ib(init=False)
|
|
131
|
+
|
|
155
132
|
client = attr.ib(init=False)
|
|
156
133
|
sync_client = attr.ib(init=False)
|
|
157
134
|
|
|
@@ -159,6 +136,10 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
159
136
|
"""Initialize clients after the class is instantiated."""
|
|
160
137
|
self.client = self.async_settings.create_client
|
|
161
138
|
self.sync_client = self.sync_settings.create_client
|
|
139
|
+
self.async_index_inserter = IndexInsertionFactory.create_insertion_strategy(
|
|
140
|
+
self.client
|
|
141
|
+
)
|
|
142
|
+
self.async_index_selector = IndexSelectorFactory.create_selector(self.client)
|
|
162
143
|
|
|
163
144
|
item_serializer: Type[ItemSerializer] = attr.ib(default=ItemSerializer)
|
|
164
145
|
collection_serializer: Type[CollectionSerializer] = attr.ib(
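The hunk above declares the selector and inserter as `attr.ib(init=False)` fields and fills them in after construction, once the client exists. The same pattern can be sketched with the standard library's `dataclasses` instead of `attrs` (a deliberate substitution to keep the example dependency-free). The class names `SimpleSelector` and `Database` are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


class SimpleSelector:
    """Hypothetical stand-in for an index selector that wraps a client."""

    def __init__(self, client: Any) -> None:
        self.client = client


@dataclass
class Database:
    # The only constructor argument: a factory that builds the client.
    client_factory: Callable[[], Any]
    # Excluded from __init__; populated in __post_init__ instead.
    client: Any = field(init=False)
    selector: SimpleSelector = field(init=False)

    def __post_init__(self) -> None:
        # Build the client first, then hand the same client to the selector,
        # so both share one connection object.
        self.client = self.client_factory()
        self.selector = SimpleSelector(self.client)


db = Database(dict)
```

Keeping the derived fields out of the constructor means callers cannot pass a selector that is wired to a different client.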
|
|
@@ -234,15 +215,23 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
234
215
|
with the index for the Collection as the target index and the combined `mk_item_id` as the document id.
|
|
235
216
|
"""
|
|
236
217
|
try:
|
|
237
|
-
|
|
218
|
+
response = await self.client.search(
|
|
238
219
|
index=index_alias_by_collection_id(collection_id),
|
|
239
|
-
|
|
220
|
+
body={
|
|
221
|
+
"query": {"term": {"_id": mk_item_id(item_id, collection_id)}},
|
|
222
|
+
"size": 1,
|
|
223
|
+
},
|
|
240
224
|
)
|
|
225
|
+
if response["hits"]["total"]["value"] == 0:
|
|
226
|
+
raise NotFoundError(
|
|
227
|
+
f"Item {item_id} does not exist inside Collection {collection_id}"
|
|
228
|
+
)
|
|
229
|
+
|
|
230
|
+
return response["hits"]["hits"][0]["_source"]
|
|
241
231
|
except exceptions.NotFoundError:
|
|
242
232
|
raise NotFoundError(
|
|
243
233
|
f"Item {item_id} does not exist inside Collection {collection_id}"
|
|
244
234
|
)
|
|
245
|
-
return item["_source"]
|
|
246
235
|
|
|
247
236
|
async def get_queryables_mapping(self, collection_id: str = "*") -> dict:
|
|
248
237
|
"""Retrieve mapping of Queryables for search.
|
|
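The item lookup in the hunk above now runs a `term` query on `_id` against the collection alias and reads the first hit. Presumably this is because an alias may point at several physical indexes once datetime-based indexing is enabled. A minimal sketch of that flow follows; `FakeSearchClient`, `ItemNotFound`, and `get_one_item` are hypothetical names, and the fake client only imitates the response shape used in the diff.

```python
import asyncio
from typing import Any, Dict


class ItemNotFound(Exception):
    """Stand-in for stac_fastapi.types.errors.NotFoundError."""


class FakeSearchClient:
    """Hypothetical in-memory stand-in for the async search client."""

    def __init__(self, docs: Dict[str, Dict[str, Any]]) -> None:
        self._docs = docs

    async def search(self, index: str, body: Dict[str, Any]) -> Dict[str, Any]:
        doc_id = body["query"]["term"]["_id"]
        hits = []
        if doc_id in self._docs:
            hits.append(
                {"_index": "items_demo_0001", "_id": doc_id, "_source": self._docs[doc_id]}
            )
        return {"hits": {"total": {"value": len(hits)}, "hits": hits}}


async def get_one_item(client: FakeSearchClient, alias: str, doc_id: str) -> Dict[str, Any]:
    """Fetch one document by id through an alias, mirroring the diff's logic."""
    response = await client.search(
        index=alias,
        body={"query": {"term": {"_id": doc_id}}, "size": 1},
    )
    if response["hits"]["total"]["value"] == 0:
        raise ItemNotFound(doc_id)
    return response["hits"]["hits"][0]["_source"]
```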
@@ -296,31 +285,21 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
296
285
|
|
|
297
286
|
@staticmethod
|
|
298
287
|
def apply_datetime_filter(
|
|
299
|
-
search: Search,
|
|
300
|
-
) -> Search:
|
|
288
|
+
search: Search, datetime: Optional[str]
|
|
289
|
+
) -> Tuple[Search, Dict[str, Optional[str]]]:
|
|
301
290
|
"""Apply a filter to search on datetime, start_datetime, and end_datetime fields.
|
|
302
291
|
|
|
303
292
|
Args:
|
|
304
293
|
search: The search object to filter.
|
|
305
|
-
|
|
306
|
-
- A single datetime string (e.g., "2023-01-01T12:00:00")
|
|
307
|
-
- A datetime range string (e.g., "2023-01-01/2023-12-31")
|
|
308
|
-
- A datetime object
|
|
309
|
-
- A tuple of (start_datetime, end_datetime)
|
|
294
|
+
datetime: Optional[str]
|
|
310
295
|
|
|
311
296
|
Returns:
|
|
312
297
|
The filtered search object.
|
|
313
298
|
"""
|
|
314
|
-
|
|
315
|
-
return search
|
|
299
|
+
datetime_search = return_date(datetime)
|
|
316
300
|
|
|
317
|
-
|
|
318
|
-
|
|
319
|
-
datetime_search = return_date(interval)
|
|
320
|
-
except (ValueError, TypeError) as e:
|
|
321
|
-
# Handle invalid interval formats if return_date fails
|
|
322
|
-
logger.error(f"Invalid interval format: {interval}, error: {e}")
|
|
323
|
-
return search
|
|
301
|
+
if not datetime_search:
|
|
302
|
+
return search, datetime_search
|
|
324
303
|
|
|
325
304
|
if "eq" in datetime_search:
|
|
326
305
|
# For exact matches, include:
|
|
@@ -387,7 +366,10 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
387
366
|
),
|
|
388
367
|
]
|
|
389
368
|
|
|
390
|
-
return
|
|
369
|
+
return (
|
|
370
|
+
search.query(Q("bool", should=should, minimum_should_match=1)),
|
|
371
|
+
datetime_search,
|
|
372
|
+
)
|
|
391
373
|
|
|
392
374
|
@staticmethod
|
|
393
375
|
def apply_bbox_filter(search: Search, bbox: List):
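With this change `apply_datetime_filter` returns the parsed datetime alongside the filtered search, so the caller can reuse it for index selection. The sketch below shows that return shape in a simplified form. The names `parse_datetime_param` and `filter_with_datetime` are hypothetical, the `gte`/`lte` keys are an assumption (only `eq` appears in the diff), and the single `properties.datetime` clause is a simplification of the real filter, which also considers `start_datetime` and `end_datetime`.

```python
from typing import Dict, List, Optional, Tuple


def parse_datetime_param(value: Optional[str]) -> Dict[str, Optional[str]]:
    """Hypothetical, simplified stand-in for `return_date`."""
    if not value:
        return {}
    if "/" not in value:
        return {"eq": value}
    start, end = value.split("/", 1)
    return {
        "gte": None if start in ("", "..") else start,
        "lte": None if end in ("", "..") else end,
    }


def filter_with_datetime(
    filters: List[dict], value: Optional[str]
) -> Tuple[List[dict], Dict[str, Optional[str]]]:
    """Return the extended filter list together with the parsed datetime."""
    parsed = parse_datetime_param(value)
    if not parsed:
        # Nothing to filter on: hand back the inputs unchanged.
        return filters, parsed
    if "eq" in parsed:
        clause = {"term": {"properties.datetime": parsed["eq"]}}
    else:
        bounds = {key: val for key, val in parsed.items() if val is not None}
        clause = {"range": {"properties.datetime": bounds}}
    return filters + [clause], parsed
```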
|
|
@@ -484,7 +466,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
484
466
|
otherwise the original Search object.
|
|
485
467
|
"""
|
|
486
468
|
if _filter is not None:
|
|
487
|
-
es_query =
|
|
469
|
+
es_query = filter_module.to_es(await self.get_queryables_mapping(), _filter)
|
|
488
470
|
search = search.filter(es_query)
|
|
489
471
|
|
|
490
472
|
return search
|
|
@@ -511,6 +493,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
511
493
|
token: Optional[str],
|
|
512
494
|
sort: Optional[Dict[str, Dict[str, str]]],
|
|
513
495
|
collection_ids: Optional[List[str]],
|
|
496
|
+
datetime_search: Dict[str, Optional[str]],
|
|
514
497
|
ignore_unavailable: bool = True,
|
|
515
498
|
) -> Tuple[Iterable[Dict[str, Any]], Optional[int], Optional[str]]:
|
|
516
499
|
"""Execute a search query with limit and other optional parameters.
|
|
@@ -521,6 +504,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
521
504
|
token (Optional[str]): The token used to return the next set of results.
|
|
522
505
|
sort (Optional[Dict[str, Dict[str, str]]]): Specifies how the results should be sorted.
|
|
523
506
|
collection_ids (Optional[List[str]]): The collection ids to search.
|
|
507
|
+
datetime_search (Dict[str, Optional[str]]): Datetime range used for index selection.
|
|
524
508
|
ignore_unavailable (bool, optional): Whether to ignore unavailable collections. Defaults to True.
|
|
525
509
|
|
|
526
510
|
Returns:
|
|
@@ -537,7 +521,9 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
537
521
|
search_body: Dict[str, Any] = {}
|
|
538
522
|
query = search.query.to_dict() if search.query else None
|
|
539
523
|
|
|
540
|
-
index_param =
|
|
524
|
+
index_param = await self.async_index_selector.select_indexes(
|
|
525
|
+
collection_ids, datetime_search
|
|
526
|
+
)
|
|
541
527
|
if len(index_param) > ES_MAX_URL_LENGTH - 300:
|
|
542
528
|
index_param = ITEM_INDICES
|
|
543
529
|
query = add_collections_to_body(collection_ids, query)
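The context lines in this hunk keep a guard from the previous version: if the selected index list is too long for a URL, the search falls back to all item indexes and restricts collections in the query body instead. A rough sketch of that guard follows. The constant values and the shape of the collection filter are assumptions for illustration, and `choose_index_param` is a hypothetical name.

```python
from typing import List, Optional, Tuple

# Assumptions for illustration; the real values live in sfeos_helpers.
ITEM_INDICES = "items_*"
ES_MAX_URL_LENGTH = 4096


def choose_index_param(
    selected: str, collection_ids: Optional[List[str]]
) -> Tuple[str, Optional[dict]]:
    """Return the index parameter and, if needed, a collection filter.

    When the comma-separated index list would make the request URL too long,
    search every item index and filter by collection in the body instead.
    """
    if len(selected) <= ES_MAX_URL_LENGTH - 300:
        return selected, None
    collection_filter = {"terms": {"collection": collection_ids or []}}
    return ITEM_INDICES, collection_filter
```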
|
|
@@ -614,6 +600,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
614
600
|
geometry_geohash_grid_precision: int,
|
|
615
601
|
geometry_geotile_grid_precision: int,
|
|
616
602
|
datetime_frequency_interval: str,
|
|
603
|
+
datetime_search,
|
|
617
604
|
ignore_unavailable: Optional[bool] = True,
|
|
618
605
|
):
|
|
619
606
|
"""Return aggregations of STAC Items."""
|
|
@@ -647,7 +634,10 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
647
634
|
if k in aggregations
|
|
648
635
|
}
|
|
649
636
|
|
|
650
|
-
index_param =
|
|
637
|
+
index_param = await self.async_index_selector.select_indexes(
|
|
638
|
+
collection_ids, datetime_search
|
|
639
|
+
)
|
|
640
|
+
|
|
651
641
|
search_task = asyncio.create_task(
|
|
652
642
|
self.client.search(
|
|
653
643
|
index=index_param,
|
|
@@ -840,8 +830,13 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
840
830
|
item = await self.async_prep_create_item(
|
|
841
831
|
item=item, base_url=base_url, exist_ok=exist_ok
|
|
842
832
|
)
|
|
833
|
+
|
|
834
|
+
target_index = await self.async_index_inserter.get_target_index(
|
|
835
|
+
collection_id, item
|
|
836
|
+
)
|
|
837
|
+
|
|
843
838
|
await self.client.index(
|
|
844
|
-
index=
|
|
839
|
+
index=target_index,
|
|
845
840
|
id=mk_item_id(item_id, collection_id),
|
|
846
841
|
body=item,
|
|
847
842
|
refresh=refresh,
|
|
@@ -874,6 +869,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
874
869
|
item_id=item_id,
|
|
875
870
|
operations=operations,
|
|
876
871
|
base_url=base_url,
|
|
872
|
+
create_nest=True,
|
|
877
873
|
refresh=refresh,
|
|
878
874
|
)
|
|
879
875
|
|
|
@@ -883,6 +879,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
883
879
|
item_id: str,
|
|
884
880
|
operations: List[PatchOperation],
|
|
885
881
|
base_url: str,
|
|
882
|
+
create_nest: bool = False,
|
|
886
883
|
refresh: bool = True,
|
|
887
884
|
) -> Item:
|
|
888
885
|
"""Database logic for json patching an item following RF6902.
|
|
@@ -917,16 +914,31 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
917
914
|
else:
|
|
918
915
|
script_operations.append(operation)
|
|
919
916
|
|
|
920
|
-
script = operations_to_script(script_operations)
|
|
917
|
+
script = operations_to_script(script_operations, create_nest=create_nest)
|
|
921
918
|
|
|
922
919
|
try:
|
|
923
|
-
await self.client.
|
|
920
|
+
search_response = await self.client.search(
|
|
924
921
|
index=index_alias_by_collection_id(collection_id),
|
|
922
|
+
body={
|
|
923
|
+
"query": {"term": {"_id": mk_item_id(item_id, collection_id)}},
|
|
924
|
+
"size": 1,
|
|
925
|
+
},
|
|
926
|
+
)
|
|
927
|
+
if search_response["hits"]["total"]["value"] == 0:
|
|
928
|
+
raise NotFoundError(
|
|
929
|
+
f"Item {item_id} does not exist inside Collection {collection_id}"
|
|
930
|
+
)
|
|
931
|
+
document_index = search_response["hits"]["hits"][0]["_index"]
|
|
932
|
+
await self.client.update(
|
|
933
|
+
index=document_index,
|
|
925
934
|
id=mk_item_id(item_id, collection_id),
|
|
926
935
|
body={"script": script},
|
|
927
936
|
refresh=True,
|
|
928
937
|
)
|
|
929
|
-
|
|
938
|
+
except exceptions.NotFoundError:
|
|
939
|
+
raise NotFoundError(
|
|
940
|
+
f"Item {item_id} does not exist inside Collection {collection_id}"
|
|
941
|
+
)
|
|
930
942
|
except exceptions.RequestError as exc:
|
|
931
943
|
raise HTTPException(
|
|
932
944
|
status_code=400, detail=exc.info["error"]["caused_by"]
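The patch path above first searches through the alias to learn which physical index holds the document, then updates that index directly. The two steps can be sketched as follows. `FakeClient` and `update_via_alias` are hypothetical names, and a partial `doc` update stands in for the Painless script used in the diff.

```python
import asyncio
from typing import Any, Dict, List, Tuple


class FakeClient:
    """Hypothetical stand-in: one alias backed by two physical indexes."""

    def __init__(self) -> None:
        self.indexes: Dict[str, Dict[str, Dict[str, Any]]] = {
            "items_demo_a": {"item-1": {"title": "old"}},
            "items_demo_b": {"item-2": {"title": "old"}},
        }
        self.updates: List[Tuple[str, str]] = []

    async def search(self, index: str, body: Dict[str, Any]) -> Dict[str, Any]:
        # The alias is treated as covering every index in this fake.
        doc_id = body["query"]["term"]["_id"]
        hits = [
            {"_index": name, "_id": doc_id, "_source": docs[doc_id]}
            for name, docs in self.indexes.items()
            if doc_id in docs
        ]
        return {"hits": {"total": {"value": len(hits)}, "hits": hits[:1]}}

    async def update(self, index: str, id: str, body: Dict[str, Any]) -> None:
        self.indexes[index][id].update(body["doc"])
        self.updates.append((index, id))


async def update_via_alias(
    client: FakeClient, alias: str, doc_id: str, doc: Dict[str, Any]
) -> str:
    """Resolve the physical index through the alias, then update it."""
    found = await client.search(
        index=alias, body={"query": {"term": {"_id": doc_id}}, "size": 1}
    )
    if found["hits"]["total"]["value"] == 0:
        raise LookupError(doc_id)
    physical_index = found["hits"]["hits"][0]["_index"]
    await client.update(index=physical_index, id=doc_id, body={"doc": doc})
    return physical_index
```

The extra search costs one round trip, but it lets the update target a single concrete index even when the alias spans several.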
|
|
@@ -945,8 +957,8 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
945
957
|
"script": {
|
|
946
958
|
"lang": "painless",
|
|
947
959
|
"source": (
|
|
948
|
-
f"""ctx._id = ctx._id.replace('{collection_id}', '{new_collection_id}');"""
|
|
949
|
-
f"""ctx._source.collection = '{new_collection_id}';"""
|
|
960
|
+
f"""ctx._id = ctx._id.replace('{collection_id}', '{new_collection_id}');""" # noqa: E702
|
|
961
|
+
f"""ctx._source.collection = '{new_collection_id}';""" # noqa: E702
|
|
950
962
|
),
|
|
951
963
|
},
|
|
952
964
|
},
|
|
@@ -1000,9 +1012,9 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1000
1012
|
)
|
|
1001
1013
|
|
|
1002
1014
|
try:
|
|
1003
|
-
await self.client.
|
|
1015
|
+
await self.client.delete_by_query(
|
|
1004
1016
|
index=index_alias_by_collection_id(collection_id),
|
|
1005
|
-
|
|
1017
|
+
body={"query": {"term": {"_id": mk_item_id(item_id, collection_id)}}},
|
|
1006
1018
|
refresh=refresh,
|
|
1007
1019
|
)
|
|
1008
1020
|
except exceptions.NotFoundError:
|
|
@@ -1093,8 +1105,10 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1093
1105
|
body=collection,
|
|
1094
1106
|
refresh=refresh,
|
|
1095
1107
|
)
|
|
1096
|
-
|
|
1097
|
-
|
|
1108
|
+
if self.async_index_inserter.should_create_collection_index():
|
|
1109
|
+
await self.async_index_inserter.create_simple_index(
|
|
1110
|
+
self.client, collection_id
|
|
1111
|
+
)
|
|
1098
1112
|
|
|
1099
1113
|
async def find_collection(self, collection_id: str) -> Collection:
|
|
1100
1114
|
"""Find and return a collection from the database.
|
|
@@ -1208,6 +1222,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1208
1222
|
collection_id=collection_id,
|
|
1209
1223
|
operations=operations,
|
|
1210
1224
|
base_url=base_url,
|
|
1225
|
+
create_nest=True,
|
|
1211
1226
|
refresh=refresh,
|
|
1212
1227
|
)
|
|
1213
1228
|
|
|
@@ -1216,6 +1231,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1216
1231
|
collection_id: str,
|
|
1217
1232
|
operations: List[PatchOperation],
|
|
1218
1233
|
base_url: str,
|
|
1234
|
+
create_nest: bool = False,
|
|
1219
1235
|
refresh: bool = True,
|
|
1220
1236
|
) -> Collection:
|
|
1221
1237
|
"""Database logic for json patching a collection following RFC 6902.
|
|
@@ -1243,7 +1259,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1243
1259
|
else:
|
|
1244
1260
|
script_operations.append(operation)
|
|
1245
1261
|
|
|
1246
|
-
script = operations_to_script(script_operations)
|
|
1262
|
+
script = operations_to_script(script_operations, create_nest=create_nest)
|
|
1247
1263
|
|
|
1248
1264
|
try:
|
|
1249
1265
|
await self.client.update(
|
|
@@ -1303,6 +1319,7 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1303
1319
|
await self.client.delete(
|
|
1304
1320
|
index=COLLECTIONS_INDEX, id=collection_id, refresh=refresh
|
|
1305
1321
|
)
|
|
1322
|
+
# Delete the item index for the collection
|
|
1306
1323
|
await delete_item_index(collection_id)
|
|
1307
1324
|
|
|
1308
1325
|
async def bulk_async(
|
|
@@ -1356,9 +1373,13 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1356
1373
|
return 0, []
|
|
1357
1374
|
|
|
1358
1375
|
raise_on_error = self.async_settings.raise_on_bulk_error
|
|
1376
|
+
actions = await self.async_index_inserter.prepare_bulk_actions(
|
|
1377
|
+
collection_id, processed_items
|
|
1378
|
+
)
|
|
1379
|
+
|
|
1359
1380
|
success, errors = await helpers.async_bulk(
|
|
1360
1381
|
self.client,
|
|
1361
|
-
|
|
1382
|
+
actions,
|
|
1362
1383
|
refresh=refresh,
|
|
1363
1384
|
raise_on_error=raise_on_error,
|
|
1364
1385
|
)
|
|
@@ -1413,6 +1434,11 @@ class DatabaseLogic(BaseDatabaseLogic):
|
|
|
1413
1434
|
f"Performing bulk insert for collection {collection_id} with refresh={refresh}"
|
|
1414
1435
|
)
|
|
1415
1436
|
|
|
1437
|
+
# Handle empty processed_items
|
|
1438
|
+
if not processed_items:
|
|
1439
|
+
logger.warning(f"No items to insert for collection {collection_id}")
|
|
1440
|
+
return 0, []
|
|
1441
|
+
|
|
1416
1442
|
# Handle empty processed_items
|
|
1417
1443
|
if not processed_items:
|
|
1418
1444
|
logger.warning(f"No items to insert for collection {collection_id}")
|
{stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi/opensearch/version.py
RENAMED
|
@@ -1,2 +1,2 @@
|
|
|
1
1
|
"""library version."""
|
|
2
|
-
__version__ = "6.1
|
|
2
|
+
__version__ = "6.2.1"
|
|
@@ -1,3 +1,49 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: stac_fastapi_opensearch
|
|
3
|
+
Version: 6.2.1
|
|
4
|
+
Summary: Opensearch stac-fastapi backend.
|
|
5
|
+
Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
|
|
6
|
+
License: MIT
|
|
7
|
+
Classifier: Intended Audience :: Developers
|
|
8
|
+
Classifier: Intended Audience :: Information Technology
|
|
9
|
+
Classifier: Intended Audience :: Science/Research
|
|
10
|
+
Classifier: Programming Language :: Python :: 3.9
|
|
11
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
12
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
13
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
14
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
15
|
+
Classifier: License :: OSI Approved :: MIT License
|
|
16
|
+
Requires-Python: >=3.9
|
|
17
|
+
Description-Content-Type: text/markdown
|
|
18
|
+
Requires-Dist: stac-fastapi-core==6.2.1
|
|
19
|
+
Requires-Dist: sfeos-helpers==6.2.1
|
|
20
|
+
Requires-Dist: opensearch-py~=2.8.0
|
|
21
|
+
Requires-Dist: opensearch-py[async]~=2.8.0
|
|
22
|
+
Requires-Dist: uvicorn~=0.23.0
|
|
23
|
+
Requires-Dist: starlette<0.36.0,>=0.35.0
|
|
24
|
+
Provides-Extra: dev
|
|
25
|
+
Requires-Dist: pytest~=7.0.0; extra == "dev"
|
|
26
|
+
Requires-Dist: pytest-cov~=4.0.0; extra == "dev"
|
|
27
|
+
Requires-Dist: pytest-asyncio~=0.21.0; extra == "dev"
|
|
28
|
+
Requires-Dist: pre-commit~=3.0.0; extra == "dev"
|
|
29
|
+
Requires-Dist: ciso8601~=2.3.0; extra == "dev"
|
|
30
|
+
Requires-Dist: httpx<0.28.0,>=0.24.0; extra == "dev"
|
|
31
|
+
Provides-Extra: docs
|
|
32
|
+
Requires-Dist: mkdocs~=1.4.0; extra == "docs"
|
|
33
|
+
Requires-Dist: mkdocs-material~=9.0.0; extra == "docs"
|
|
34
|
+
Requires-Dist: pdocs~=1.2.0; extra == "docs"
|
|
35
|
+
Provides-Extra: server
|
|
36
|
+
Requires-Dist: uvicorn[standard]~=0.23.0; extra == "server"
|
|
37
|
+
Dynamic: classifier
|
|
38
|
+
Dynamic: description
|
|
39
|
+
Dynamic: description-content-type
|
|
40
|
+
Dynamic: home-page
|
|
41
|
+
Dynamic: license
|
|
42
|
+
Dynamic: provides-extra
|
|
43
|
+
Dynamic: requires-dist
|
|
44
|
+
Dynamic: requires-python
|
|
45
|
+
Dynamic: summary
|
|
46
|
+
|
|
1
47
|
# stac-fastapi-elasticsearch-opensearch
|
|
2
48
|
|
|
3
49
|
<!-- markdownlint-disable MD033 MD041 -->
|
|
@@ -85,6 +131,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
|
|
|
85
131
|
- [Auth](#auth)
|
|
86
132
|
- [Aggregation](#aggregation)
|
|
87
133
|
- [Rate Limiting](#rate-limiting)
|
|
134
|
+
- [Datetime-Based Index Management](#datetime-based-index-management)
|
|
88
135
|
|
|
89
136
|
## Documentation & Resources
|
|
90
137
|
|
|
@@ -226,10 +273,86 @@ You can customize additional settings in your `.env` file:
|
|
|
226
273
|
| `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
|
|
227
274
|
| `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
|
|
228
275
|
| `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
|
|
276
|
+
| `STAC_ITEM_LIMIT` | Sets the result limit that SFEOS applies to the number of returned items and STAC collections. | `10` | Optional |
|
|
229
277
|
|
|
230
278
|
> [!NOTE]
|
|
231
279
|
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
|
|
232
280
|
|
|
281
|
+
## Datetime-Based Index Management
|
|
282
|
+
|
|
283
|
+
### Overview
|
|
284
|
+
|
|
285
|
+
SFEOS supports two indexing strategies for managing STAC items:
|
|
286
|
+
|
|
287
|
+
1. **Simple Indexing** (default) - One index per collection
|
|
288
|
+
2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
|
|
289
|
+
|
|
290
|
+
The datetime-based indexing strategy is particularly useful for large temporal datasets. When a query includes a datetime parameter, the system can restrict the search to the indexes that cover the requested time range, yielding **searches that are several times faster** and a significantly **reduced database load**.
|
|
291
|
+
|
|
292
|
+
### When to Use
|
|
293
|
+
|
|
294
|
+
**Recommended for:**
|
|
295
|
+
- Systems with large collections containing millions of items
|
|
296
|
+
- Systems requiring high-performance temporal searching
|
|
297
|
+
|
|
298
|
+
**Pros:**
|
|
299
|
+
- Queries with a datetime filter run several times faster
|
|
300
|
+
- Reduced database load - only relevant indexes are searched
|
|
301
|
+
|
|
302
|
+
**Cons:**
|
|
303
|
+
- Slightly longer item indexing time (automatic index management)
|
|
304
|
+
- Greater management complexity
|
|
305
|
+
|
|
306
|
+
### Configuration
|
|
307
|
+
|
|
308
|
+
#### Enabling Datetime-Based Indexing
|
|
309
|
+
|
|
310
|
+
Enable datetime-based indexing by setting the following environment variable:
|
|
311
|
+
|
|
312
|
+
```bash
|
|
313
|
+
ENABLE_DATETIME_INDEX_FILTERING=true
|
|
314
|
+
```
|
|
315
|
+
|
|
316
|
+
### Related Configuration Variables
|
|
317
|
+
|
|
318
|
+
| Variable | Description | Default | Example |
|
|
319
|
+
|----------|-------------|---------|---------|
|
|
320
|
+
| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
|
|
321
|
+
| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
|
|
322
|
+
| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
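
As a rough illustration of how these variables might be consumed, the sketch below reads them with the defaults listed in the table. The function `load_index_settings` and the way booleans are parsed are assumptions made for this example, not SFEOS's actual settings code.

```python
import os
from typing import Mapping


def load_index_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the datetime-index variables, falling back to the documented defaults.

    Hypothetical helper: SFEOS may parse these values differently.
    """
    return {
        # Assumption: only the literal string "true" (any case) enables the feature.
        "datetime_filtering": env.get("ENABLE_DATETIME_INDEX_FILTERING", "false").lower() == "true",
        # Size limit in gigabytes; the table lists 25 as the default.
        "max_size_gb": float(env.get("DATETIME_INDEX_MAX_SIZE_GB", "25")),
        # Prefix shared by item indexes and aliases.
        "items_prefix": env.get("STAC_ITEMS_INDEX_PREFIX", "items_"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise in tests without touching the process environment.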
|
|
323
|
+
|
|
324
|
+
## How Datetime-Based Indexing Works
|
|
325
|
+
|
|
326
|
+
### Index and Alias Naming Convention
|
|
327
|
+
|
|
328
|
+
The system uses a precise naming convention:
|
|
329
|
+
|
|
330
|
+
**Physical indexes:**
|
|
331
|
+
```
|
|
332
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
|
|
333
|
+
```
|
|
334
|
+
|
|
335
|
+
**Aliases:**
|
|
336
|
+
```
|
|
337
|
+
{ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
|
|
338
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
|
|
339
|
+
{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
|
|
340
|
+
```
|
|
341
|
+
|
|
342
|
+
**Example:**
|
|
343
|
+
|
|
344
|
+
*Physical indexes:*
|
|
345
|
+
- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
|
|
346
|
+
|
|
347
|
+
*Aliases:*
|
|
348
|
+
- `items_sentinel-2-l2a` - main collection alias
|
|
349
|
+
- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
|
|
350
|
+
- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
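
The naming convention above can be sketched with two small helpers. This is only an illustration under stated assumptions: `physical_index_name` and `temporal_alias` are hypothetical names, the prefix is assumed to be the default `items_`, and any sanitising of collection IDs that SFEOS may perform is omitted.

```python
import uuid
from typing import Optional

ITEMS_INDEX_PREFIX = "items_"  # assumed default value of STAC_ITEMS_INDEX_PREFIX


def physical_index_name(collection_id: str) -> str:
    """Build a physical index name: prefix, collection id, then a random UUID4."""
    return f"{ITEMS_INDEX_PREFIX}{collection_id}_{uuid.uuid4()}"


def temporal_alias(collection_id: str, start: str, end: Optional[str] = None) -> str:
    """Build a temporal alias; passing an end date yields the closed-index form."""
    alias = f"{ITEMS_INDEX_PREFIX}{collection_id}_{start}"
    if end is not None:
        alias += f"_{end}"
    return alias
```

For example, `temporal_alias("sentinel-2-l2a", "2024-01-01", "2024-03-15")` produces the closed index alias shown in the list above.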
|
|
351
|
+
|
|
352
|
+
### Index Size Management
|
|
353
|
+
|
|
354
|
+
**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add +20% to the target size to account for compression overhead and metadata.
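
One possible reading of the +20% recommendation is that the configured limit should sit 20% above the size you are aiming for. The sketch below follows that reading; `recommended_limit_gb` is a hypothetical helper, and the interpretation itself is an assumption rather than documented SFEOS behaviour.

```python
import math


def recommended_limit_gb(target_gb: int) -> int:
    """Return the target size plus 20%, rounded up to a whole gigabyte."""
    # Multiply before dividing so whole-number targets avoid float rounding surprises.
    return math.ceil(target_gb * 120 / 100)
```

Under this assumption, a 25 GB target corresponds to `DATETIME_INDEX_MAX_SIZE_GB=30`.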
|
|
355
|
+
|
|
233
356
|
## Interacting with the API
|
|
234
357
|
|
|
235
358
|
- **Creating a Collection**:
|
|
@@ -538,4 +661,3 @@ You can customize additional settings in your `.env` file:
|
|
|
538
661
|
- Ensures fair resource allocation among all clients
|
|
539
662
|
|
|
540
663
|
- **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
|
|
541
|
-
|
|
File without changes
|
{stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.1}/stac_fastapi/opensearch/__init__.py
RENAMED
|
File without changes
|
|