stac-fastapi-opensearch 6.1.0__tar.gz → 6.2.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (16)
  1. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/PKG-INFO +77 -2
  2. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/README.md +76 -1
  3. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/setup.py +2 -2
  4. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/app.py +1 -1
  5. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/config.py +1 -0
  6. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/database_logic.py +89 -67
  7. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/version.py +1 -1
  8. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/PKG-INFO +77 -2
  9. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/requires.txt +2 -2
  10. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/setup.cfg +0 -0
  11. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi/opensearch/__init__.py +0 -0
  12. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/SOURCES.txt +0 -0
  13. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/dependency_links.txt +0 -0
  14. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/entry_points.txt +0 -0
  15. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/not-zip-safe +0 -0
  16. {stac_fastapi_opensearch-6.1.0 → stac_fastapi_opensearch-6.2.0}/stac_fastapi_opensearch.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: stac_fastapi_opensearch
-Version: 6.1.0
+Version: 6.2.0
 Summary: Opensearch stac-fastapi backend.
 Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
 License: MIT
@@ -106,6 +106,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Auth](#auth)
 - [Aggregation](#aggregation)
 - [Rate Limiting](#rate-limiting)
+- [Datetime-Based Index Management](#datetime-based-index-management)
 
 ## Documentation & Resources
 
@@ -251,6 +252,81 @@ You can customize additional settings in your `.env` file:
 > [!NOTE]
 > The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
 
+## Datetime-Based Index Management
+
+### Overview
+
+SFEOS supports two indexing strategies for managing STAC items:
+
+1. **Simple Indexing** (default) - One index per collection
+2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
+
+The datetime-based indexing strategy is particularly useful for large temporal datasets. When a user provides a datetime parameter in a query, the system knows exactly which index to search, providing **multiple times faster searches** and significantly **reducing database load**.
+
+### When to Use
+
+**Recommended for:**
+- Systems with large collections containing millions of items
+- Systems requiring high-performance temporal searching
+
+**Pros:**
+- Multiple times faster queries with datetime filter
+- Reduced database load - only relevant indexes are searched
+
+**Cons:**
+- Slightly longer item indexing time (automatic index management)
+- Greater management complexity
+
+### Configuration
+
+#### Enabling Datetime-Based Indexing
+
+Enable datetime-based indexing by setting the following environment variable:
+
+```bash
+ENABLE_DATETIME_INDEX_FILTERING=true
+```
+
+### Related Configuration Variables
+
+| Variable | Description | Default | Example |
+|----------|-------------|---------|---------|
+| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
+| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
+| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
+
+## How Datetime-Based Indexing Works
+
+### Index and Alias Naming Convention
+
+The system uses a precise naming convention:
+
+**Physical indexes:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
+```
+
+**Aliases:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id}                                   # Main collection alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}                  # Temporal alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime}   # Closed index alias
+```
+
+**Example:**
+
+*Physical indexes:*
+- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
+
+*Aliases:*
+- `items_sentinel-2-l2a` - main collection alias
+- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
+- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
+
+### Index Size Management
+
+**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add +20% to the target size to account for compression overhead and metadata.
+
 ## Interacting with the API
 
 - **Creating a Collection**:
@@ -559,4 +635,3 @@ You can customize additional settings in your `.env` file:
 - Ensures fair resource allocation among all clients
 
 - **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
-
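The temporal-alias scheme described in the section above can be sketched in a few lines. This is not the `sfeos_helpers.search_engine` implementation (its `IndexSelectorFactory` internals are not shown in this diff); the helper names and the simplified alias parsing below are hypothetical, but they illustrate how aliases of the form `{prefix}{collection}_{start}` / `{prefix}{collection}_{start}_{end}` let a datetime-filtered query be routed to only the overlapping partitions:

```python
from datetime import date

ITEMS_INDEX_PREFIX = "items_"  # documented default


def alias_range(alias: str, collection_id: str):
    """Parse the date range encoded in a temporal alias name.

    Returns (start, end); end is None for the still-open partition,
    and None overall for the main collection alias (no range encoded).
    Parsing is simplified: it assumes the alias starts with the prefix
    plus the collection id.
    """
    base = f"{ITEMS_INDEX_PREFIX}{collection_id}"
    suffix = alias[len(base):].strip("_")
    if not suffix:
        return None
    parts = suffix.split("_")
    start = date.fromisoformat(parts[0])
    end = date.fromisoformat(parts[1]) if len(parts) > 1 else None
    return start, end


def select_aliases(aliases, collection_id, q_start: date, q_end: date):
    """Keep only temporal aliases whose range overlaps [q_start, q_end]."""
    hits = []
    for alias in aliases:
        rng = alias_range(alias, collection_id)
        if rng is None:
            continue  # main alias carries no range information
        start, end = rng
        # Overlap test: an open partition (end is None) extends to "now".
        if start <= q_end and (end is None or end >= q_start):
            hits.append(alias)
    return hits


aliases = [
    "items_sentinel-2-l2a",
    "items_sentinel-2-l2a_2024-01-01_2024-03-15",
    "items_sentinel-2-l2a_2024-03-15",
]
print(select_aliases(aliases, "sentinel-2-l2a", date(2024, 2, 1), date(2024, 2, 28)))
# → ['items_sentinel-2-l2a_2024-01-01_2024-03-15']
```

A query without a datetime filter would simply fall back to the main collection alias, which fans out to every partition.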
@@ -85,6 +85,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Auth](#auth)
 - [Aggregation](#aggregation)
 - [Rate Limiting](#rate-limiting)
+- [Datetime-Based Index Management](#datetime-based-index-management)
 
 ## Documentation & Resources
 
@@ -230,6 +231,81 @@ You can customize additional settings in your `.env` file:
 > [!NOTE]
 > The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
 
+## Datetime-Based Index Management
+
+### Overview
+
+SFEOS supports two indexing strategies for managing STAC items:
+
+1. **Simple Indexing** (default) - One index per collection
+2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
+
+The datetime-based indexing strategy is particularly useful for large temporal datasets. When a user provides a datetime parameter in a query, the system knows exactly which index to search, providing **multiple times faster searches** and significantly **reducing database load**.
+
+### When to Use
+
+**Recommended for:**
+- Systems with large collections containing millions of items
+- Systems requiring high-performance temporal searching
+
+**Pros:**
+- Multiple times faster queries with datetime filter
+- Reduced database load - only relevant indexes are searched
+
+**Cons:**
+- Slightly longer item indexing time (automatic index management)
+- Greater management complexity
+
+### Configuration
+
+#### Enabling Datetime-Based Indexing
+
+Enable datetime-based indexing by setting the following environment variable:
+
+```bash
+ENABLE_DATETIME_INDEX_FILTERING=true
+```
+
+### Related Configuration Variables
+
+| Variable | Description | Default | Example |
+|----------|-------------|---------|---------|
+| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
+| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
+| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
+
+## How Datetime-Based Indexing Works
+
+### Index and Alias Naming Convention
+
+The system uses a precise naming convention:
+
+**Physical indexes:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
+```
+
+**Aliases:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id}                                   # Main collection alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}                  # Temporal alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime}   # Closed index alias
+```
+
+**Example:**
+
+*Physical indexes:*
+- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
+
+*Aliases:*
+- `items_sentinel-2-l2a` - main collection alias
+- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
+- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
+
+### Index Size Management
+
+**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add +20% to the target size to account for compression overhead and metadata.
+
 ## Interacting with the API
 
 - **Creating a Collection**:
@@ -538,4 +614,3 @@ You can customize additional settings in your `.env` file:
 - Ensures fair resource allocation among all clients
 
 - **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
-
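The README's naming convention also implies how names are generated when a partition rolls over. The sketch below is hypothetical (the real rollover logic lives in `sfeos_helpers.search_engine`, which this diff only imports); it only shows, under the documented convention, which strings are produced when a new partition is opened and when a full one is closed:

```python
import uuid

ITEMS_INDEX_PREFIX = "items_"  # documented default


def new_partition_names(collection_id: str, start_datetime: str):
    """Build the physical index name and its aliases for a new partition.

    The physical index gets a uuid4 suffix; queries never address it
    directly, only through the aliases.
    """
    physical = f"{ITEMS_INDEX_PREFIX}{collection_id}_{uuid.uuid4()}"
    aliases = [
        f"{ITEMS_INDEX_PREFIX}{collection_id}",                   # main collection alias
        f"{ITEMS_INDEX_PREFIX}{collection_id}_{start_datetime}",  # active temporal alias
    ]
    return physical, aliases


def close_partition_alias(collection_id: str, start_datetime: str, end_datetime: str):
    """Alias name once a partition reaches DATETIME_INDEX_MAX_SIZE_GB and is closed."""
    return f"{ITEMS_INDEX_PREFIX}{collection_id}_{start_datetime}_{end_datetime}"


physical, aliases = new_partition_names("sentinel-2-l2a", "2024-01-01")
print(aliases)
print(close_partition_alias("sentinel-2-l2a", "2024-01-01", "2024-03-15"))
# → items_sentinel-2-l2a_2024-01-01_2024-03-15
```

Because the main alias always covers every partition, existing clients that know nothing about datetime partitioning keep working unchanged.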
@@ -6,8 +6,8 @@ with open("README.md") as f:
     desc = f.read()
 
 install_requires = [
-    "stac-fastapi-core==6.1.0",
-    "sfeos-helpers==6.1.0",
+    "stac-fastapi-core==6.2.0",
+    "sfeos-helpers==6.2.0",
     "opensearch-py~=2.8.0",
     "opensearch-py[async]~=2.8.0",
     "uvicorn~=0.23.0",
@@ -118,7 +118,7 @@ post_request_model = create_post_request_model(search_extensions)
 app_config = {
     "title": os.getenv("STAC_FASTAPI_TITLE", "stac-fastapi-opensearch"),
     "description": os.getenv("STAC_FASTAPI_DESCRIPTION", "stac-fastapi-opensearch"),
-    "api_version": os.getenv("STAC_FASTAPI_VERSION", "6.1.0"),
+    "api_version": os.getenv("STAC_FASTAPI_VERSION", "6.2.0"),
     "settings": settings,
     "extensions": extensions,
     "client": CoreClient(
@@ -1,4 +1,5 @@
 """API configuration."""
+
 import logging
 import os
 import ssl
@@ -4,7 +4,7 @@ import asyncio
 import logging
 from base64 import urlsafe_b64decode, urlsafe_b64encode
 from copy import deepcopy
-from typing import Any, Dict, Iterable, List, Optional, Tuple, Type, Union
+from typing import Any, Dict, Iterable, List, Optional, Tuple, Type
 
 import attr
 import orjson
@@ -26,7 +26,7 @@ from stac_fastapi.opensearch.config import (
     AsyncOpensearchSettings as AsyncSearchSettings,
 )
 from stac_fastapi.opensearch.config import OpensearchSettings as SyncSearchSettings
-from stac_fastapi.sfeos_helpers import filter
+from stac_fastapi.sfeos_helpers import filter as filter_module
 from stac_fastapi.sfeos_helpers.database import (
     apply_free_text_filter_shared,
     apply_intersects_filter_shared,
@@ -34,8 +34,6 @@ from stac_fastapi.sfeos_helpers.database import (
     delete_item_index_shared,
     get_queryables_mapping_shared,
     index_alias_by_collection_id,
-    index_by_collection_id,
-    indices,
     mk_actions,
     mk_item_id,
     populate_sort_shared,
@@ -55,15 +53,18 @@ from stac_fastapi.sfeos_helpers.mappings import (
     COLLECTIONS_INDEX,
     DEFAULT_SORT,
     ES_COLLECTIONS_MAPPINGS,
-    ES_ITEMS_MAPPINGS,
-    ES_ITEMS_SETTINGS,
     ITEM_INDICES,
     ITEMS_INDEX_PREFIX,
     Geometry,
 )
+from stac_fastapi.sfeos_helpers.search_engine import (
+    BaseIndexInserter,
+    BaseIndexSelector,
+    IndexInsertionFactory,
+    IndexSelectorFactory,
+)
 from stac_fastapi.types.errors import ConflictError, NotFoundError
 from stac_fastapi.types.links import resolve_links
-from stac_fastapi.types.rfc3339 import DateTimeType
 from stac_fastapi.types.stac import Collection, Item
 
 logger = logging.getLogger(__name__)
@@ -104,33 +105,6 @@ async def create_collection_index() -> None:
     await client.close()
 
 
-async def create_item_index(collection_id: str) -> None:
-    """
-    Create the index for Items. The settings of the index template will be used implicitly.
-
-    Args:
-        collection_id (str): Collection identifier.
-
-    Returns:
-        None
-
-    """
-    client = AsyncSearchSettings().create_client
-
-    index_name = f"{index_by_collection_id(collection_id)}-000001"
-    exists = await client.indices.exists(index=index_name)
-    if not exists:
-        await client.indices.create(
-            index=index_name,
-            body={
-                "aliases": {index_alias_by_collection_id(collection_id): {}},
-                "mappings": ES_ITEMS_MAPPINGS,
-                "settings": ES_ITEMS_SETTINGS,
-            },
-        )
-    await client.close()
-
-
 async def delete_item_index(collection_id: str) -> None:
     """Delete the index for items in a collection.
 
@@ -152,6 +126,9 @@ class DatabaseLogic(BaseDatabaseLogic):
     async_settings: AsyncSearchSettings = attr.ib(factory=AsyncSearchSettings)
     sync_settings: SyncSearchSettings = attr.ib(factory=SyncSearchSettings)
 
+    async_index_selector: BaseIndexSelector = attr.ib(init=False)
+    async_index_inserter: BaseIndexInserter = attr.ib(init=False)
+
     client = attr.ib(init=False)
     sync_client = attr.ib(init=False)
 
@@ -159,6 +136,10 @@ class DatabaseLogic(BaseDatabaseLogic):
         """Initialize clients after the class is instantiated."""
         self.client = self.async_settings.create_client
         self.sync_client = self.sync_settings.create_client
+        self.async_index_inserter = IndexInsertionFactory.create_insertion_strategy(
+            self.client
+        )
+        self.async_index_selector = IndexSelectorFactory.create_selector(self.client)
 
     item_serializer: Type[ItemSerializer] = attr.ib(default=ItemSerializer)
     collection_serializer: Type[CollectionSerializer] = attr.ib(
@@ -234,15 +215,23 @@ class DatabaseLogic(BaseDatabaseLogic):
         with the index for the Collection as the target index and the combined `mk_item_id` as the document id.
         """
         try:
-            item = await self.client.get(
+            response = await self.client.search(
                 index=index_alias_by_collection_id(collection_id),
-                id=mk_item_id(item_id, collection_id),
+                body={
+                    "query": {"term": {"_id": mk_item_id(item_id, collection_id)}},
+                    "size": 1,
+                },
             )
+            if response["hits"]["total"]["value"] == 0:
+                raise NotFoundError(
+                    f"Item {item_id} does not exist inside Collection {collection_id}"
+                )
+
+            return response["hits"]["hits"][0]["_source"]
         except exceptions.NotFoundError:
             raise NotFoundError(
                 f"Item {item_id} does not exist inside Collection {collection_id}"
             )
-        return item["_source"]
 
     async def get_queryables_mapping(self, collection_id: str = "*") -> dict:
         """Retrieve mapping of Queryables for search.
@@ -296,31 +285,21 @@ class DatabaseLogic(BaseDatabaseLogic):
 
     @staticmethod
     def apply_datetime_filter(
-        search: Search, interval: Optional[Union[DateTimeType, str]]
-    ) -> Search:
+        search: Search, datetime: Optional[str]
+    ) -> Tuple[Search, Dict[str, Optional[str]]]:
         """Apply a filter to search on datetime, start_datetime, and end_datetime fields.
 
         Args:
             search: The search object to filter.
-            interval: Optional datetime interval to filter by. Can be:
-                - A single datetime string (e.g., "2023-01-01T12:00:00")
-                - A datetime range string (e.g., "2023-01-01/2023-12-31")
-                - A datetime object
-                - A tuple of (start_datetime, end_datetime)
+            datetime: Optional[str]
 
         Returns:
             The filtered search object.
         """
-        if not interval:
-            return search
+        datetime_search = return_date(datetime)
 
-        should = []
-        try:
-            datetime_search = return_date(interval)
-        except (ValueError, TypeError) as e:
-            # Handle invalid interval formats if return_date fails
-            logger.error(f"Invalid interval format: {interval}, error: {e}")
-            return search
+        if not datetime_search:
+            return search, datetime_search
 
         if "eq" in datetime_search:
             # For exact matches, include:
@@ -387,7 +366,10 @@ class DatabaseLogic(BaseDatabaseLogic):
             ),
         ]
 
-        return search.query(Q("bool", should=should, minimum_should_match=1))
+        return (
+            search.query(Q("bool", should=should, minimum_should_match=1)),
+            datetime_search,
+        )
 
     @staticmethod
     def apply_bbox_filter(search: Search, bbox: List):
@@ -484,7 +466,7 @@ class DatabaseLogic(BaseDatabaseLogic):
             otherwise the original Search object.
         """
         if _filter is not None:
-            es_query = filter.to_es(await self.get_queryables_mapping(), _filter)
+            es_query = filter_module.to_es(await self.get_queryables_mapping(), _filter)
             search = search.filter(es_query)
 
         return search
@@ -511,6 +493,7 @@ class DatabaseLogic(BaseDatabaseLogic):
         token: Optional[str],
         sort: Optional[Dict[str, Dict[str, str]]],
         collection_ids: Optional[List[str]],
+        datetime_search: Dict[str, Optional[str]],
         ignore_unavailable: bool = True,
     ) -> Tuple[Iterable[Dict[str, Any]], Optional[int], Optional[str]]:
         """Execute a search query with limit and other optional parameters.
@@ -521,6 +504,7 @@ class DatabaseLogic(BaseDatabaseLogic):
             token (Optional[str]): The token used to return the next set of results.
             sort (Optional[Dict[str, Dict[str, str]]]): Specifies how the results should be sorted.
             collection_ids (Optional[List[str]]): The collection ids to search.
+            datetime_search (Dict[str, Optional[str]]): Datetime range used for index selection.
             ignore_unavailable (bool, optional): Whether to ignore unavailable collections. Defaults to True.
 
         Returns:
@@ -537,7 +521,9 @@ class DatabaseLogic(BaseDatabaseLogic):
         search_body: Dict[str, Any] = {}
         query = search.query.to_dict() if search.query else None
 
-        index_param = indices(collection_ids)
+        index_param = await self.async_index_selector.select_indexes(
+            collection_ids, datetime_search
+        )
         if len(index_param) > ES_MAX_URL_LENGTH - 300:
             index_param = ITEM_INDICES
             query = add_collections_to_body(collection_ids, query)
@@ -614,6 +600,7 @@ class DatabaseLogic(BaseDatabaseLogic):
         geometry_geohash_grid_precision: int,
         geometry_geotile_grid_precision: int,
         datetime_frequency_interval: str,
+        datetime_search,
         ignore_unavailable: Optional[bool] = True,
     ):
         """Return aggregations of STAC Items."""
@@ -647,7 +634,10 @@ class DatabaseLogic(BaseDatabaseLogic):
             if k in aggregations
         }
 
-        index_param = indices(collection_ids)
+        index_param = await self.async_index_selector.select_indexes(
+            collection_ids, datetime_search
+        )
+
         search_task = asyncio.create_task(
             self.client.search(
                 index=index_param,
@@ -840,8 +830,13 @@ class DatabaseLogic(BaseDatabaseLogic):
         item = await self.async_prep_create_item(
             item=item, base_url=base_url, exist_ok=exist_ok
         )
+
+        target_index = await self.async_index_inserter.get_target_index(
+            collection_id, item
+        )
+
         await self.client.index(
-            index=index_alias_by_collection_id(collection_id),
+            index=target_index,
             id=mk_item_id(item_id, collection_id),
             body=item,
             refresh=refresh,
@@ -920,13 +915,28 @@ class DatabaseLogic(BaseDatabaseLogic):
         script = operations_to_script(script_operations)
 
         try:
-            await self.client.update(
+            search_response = await self.client.search(
                 index=index_alias_by_collection_id(collection_id),
+                body={
+                    "query": {"term": {"_id": mk_item_id(item_id, collection_id)}},
+                    "size": 1,
+                },
+            )
+            if search_response["hits"]["total"]["value"] == 0:
+                raise NotFoundError(
+                    f"Item {item_id} does not exist inside Collection {collection_id}"
+                )
+            document_index = search_response["hits"]["hits"][0]["_index"]
+            await self.client.update(
+                index=document_index,
                 id=mk_item_id(item_id, collection_id),
                 body={"script": script},
                 refresh=True,
             )
-
+        except exceptions.NotFoundError:
+            raise NotFoundError(
+                f"Item {item_id} does not exist inside Collection {collection_id}"
+            )
         except exceptions.RequestError as exc:
             raise HTTPException(
                 status_code=400, detail=exc.info["error"]["caused_by"]
@@ -945,8 +955,8 @@ class DatabaseLogic(BaseDatabaseLogic):
                 "script": {
                     "lang": "painless",
                     "source": (
-                        f"""ctx._id = ctx._id.replace('{collection_id}', '{new_collection_id}');"""
-                        f"""ctx._source.collection = '{new_collection_id}';"""
+                        f"""ctx._id = ctx._id.replace('{collection_id}', '{new_collection_id}');"""  # noqa: E702
+                        f"""ctx._source.collection = '{new_collection_id}';"""  # noqa: E702
                     ),
                 },
             },
@@ -1000,9 +1010,9 @@ class DatabaseLogic(BaseDatabaseLogic):
         )
 
         try:
-            await self.client.delete(
+            await self.client.delete_by_query(
                 index=index_alias_by_collection_id(collection_id),
-                id=mk_item_id(item_id, collection_id),
+                body={"query": {"term": {"_id": mk_item_id(item_id, collection_id)}}},
                 refresh=refresh,
             )
         except exceptions.NotFoundError:
@@ -1093,8 +1103,10 @@ class DatabaseLogic(BaseDatabaseLogic):
             body=collection,
             refresh=refresh,
         )
-
-        await create_item_index(collection_id)
+        if self.async_index_inserter.should_create_collection_index():
+            await self.async_index_inserter.create_simple_index(
+                self.client, collection_id
+            )
 
     async def find_collection(self, collection_id: str) -> Collection:
         """Find and return a collection from the database.
@@ -1303,6 +1315,7 @@ class DatabaseLogic(BaseDatabaseLogic):
         await self.client.delete(
             index=COLLECTIONS_INDEX, id=collection_id, refresh=refresh
        )
+        # Delete the item index for the collection
         await delete_item_index(collection_id)
 
     async def bulk_async(
@@ -1356,9 +1369,13 @@ class DatabaseLogic(BaseDatabaseLogic):
             return 0, []
 
         raise_on_error = self.async_settings.raise_on_bulk_error
+        actions = await self.async_index_inserter.prepare_bulk_actions(
+            collection_id, processed_items
+        )
+
         success, errors = await helpers.async_bulk(
             self.client,
-            mk_actions(collection_id, processed_items),
+            actions,
             refresh=refresh,
             raise_on_error=raise_on_error,
        )
@@ -1413,6 +1430,11 @@ class DatabaseLogic(BaseDatabaseLogic):
             f"Performing bulk insert for collection {collection_id} with refresh={refresh}"
         )
 
+        # Handle empty processed_items
+        if not processed_items:
+            logger.warning(f"No items to insert for collection {collection_id}")
+            return 0, []
+
         # Handle empty processed_items
         if not processed_items:
             logger.warning(f"No items to insert for collection {collection_id}")
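A recurring pattern in the `database_logic.py` hunks above is that single-document `get`/`update` calls, which need one concrete index, are replaced by a term query on `_id` through the collection alias, followed by routing to the `_index` reported in the hit. The toy in-memory simulation below is not the real OpenSearch client; it only mirrors the documented search-response shape (`hits.total.value`, `hits.hits[0]._index`) to show why the search-then-route step is needed once an alias fans out to several time-partitioned indexes:

```python
# Toy stand-in: a dict of {physical_index_name: {doc_id: source}} plays the
# role of all indexes behind one collection alias.
def search_by_id(indexes: dict, doc_id: str) -> dict:
    """Mimic a term query on _id fanned out across every index behind an alias."""
    for index_name, docs in indexes.items():
        if doc_id in docs:
            return {
                "hits": {
                    "total": {"value": 1},
                    "hits": [{"_index": index_name, "_id": doc_id, "_source": docs[doc_id]}],
                }
            }
    return {"hits": {"total": {"value": 0}, "hits": []}}


def update_item(indexes: dict, doc_id: str, patch: dict) -> str:
    """Search first, then apply the update to the physical index holding the doc."""
    resp = search_by_id(indexes, doc_id)
    if resp["hits"]["total"]["value"] == 0:
        raise KeyError(f"{doc_id} not found")
    target_index = resp["hits"]["hits"][0]["_index"]  # route by the hit's _index
    indexes[target_index][doc_id].update(patch)
    return target_index


indexes = {
    "items_s2_2024-01-01_2024-03-15": {"item-1": {"collection": "s2"}},
    "items_s2_2024-03-15": {"item-2": {"collection": "s2"}},
}
print(update_item(indexes, "item-2", {"updated": True}))
# → items_s2_2024-03-15
```

The same reasoning explains the switch from `client.delete` to `client.delete_by_query` in the diff: a delete by id also needs a single index, whereas a query can be resolved across all partitions behind the alias.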
@@ -1,2 +1,2 @@
 """library version."""
-__version__ = "6.1.0"
+__version__ = "6.2.0"
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: stac-fastapi-opensearch
-Version: 6.1.0
+Version: 6.2.0
 Summary: Opensearch stac-fastapi backend.
 Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
 License: MIT
@@ -106,6 +106,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Auth](#auth)
 - [Aggregation](#aggregation)
 - [Rate Limiting](#rate-limiting)
+- [Datetime-Based Index Management](#datetime-based-index-management)
 
 ## Documentation & Resources
 
@@ -251,6 +252,81 @@ You can customize additional settings in your `.env` file:
 > [!NOTE]
 > The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
 
+## Datetime-Based Index Management
+
+### Overview
+
+SFEOS supports two indexing strategies for managing STAC items:
+
+1. **Simple Indexing** (default) - One index per collection
+2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
+
+The datetime-based indexing strategy is particularly useful for large temporal datasets. When a user provides a datetime parameter in a query, the system knows exactly which index to search, providing **multiple times faster searches** and significantly **reducing database load**.
+
+### When to Use
+
+**Recommended for:**
+- Systems with large collections containing millions of items
+- Systems requiring high-performance temporal searching
+
+**Pros:**
+- Multiple times faster queries with datetime filter
+- Reduced database load - only relevant indexes are searched
+
+**Cons:**
+- Slightly longer item indexing time (automatic index management)
+- Greater management complexity
+
+### Configuration
+
+#### Enabling Datetime-Based Indexing
+
+Enable datetime-based indexing by setting the following environment variable:
+
+```bash
+ENABLE_DATETIME_INDEX_FILTERING=true
+```
+
+### Related Configuration Variables
+
+| Variable | Description | Default | Example |
+|----------|-------------|---------|---------|
+| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
+| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
+| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
+
+## How Datetime-Based Indexing Works
+
+### Index and Alias Naming Convention
+
+The system uses a precise naming convention:
+
+**Physical indexes:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
+```
+
+**Aliases:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id}                                   # Main collection alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}                  # Temporal alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime}   # Closed index alias
+```
+
+**Example:**
+
+*Physical indexes:*
+- `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
+
+*Aliases:*
+- `items_sentinel-2-l2a` - main collection alias
+- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
+- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
+
+### Index Size Management
+
+**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add +20% to the target size to account for compression overhead and metadata.
+
 ## Interacting with the API
 
 - **Creating a Collection**:
@@ -559,4 +635,3 @@ You can customize additional settings in your `.env` file:
 - Ensures fair resource allocation among all clients
 
 - **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
-
@@ -1,5 +1,5 @@
-stac-fastapi-core==6.1.0
-sfeos-helpers==6.1.0
+stac-fastapi-core==6.2.0
+sfeos-helpers==6.2.0
 opensearch-py~=2.8.0
 opensearch-py[async]~=2.8.0
 uvicorn~=0.23.0