sfeos-helpers 6.1.0__py3-none-any.whl → 6.2.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: sfeos-helpers
3
- Version: 6.1.0
3
+ Version: 6.2.0
4
4
  Summary: Helper library for the Elasticsearch and Opensearch stac-fastapi backends.
5
5
  Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
6
6
  License: MIT
@@ -15,7 +15,7 @@ Classifier: Programming Language :: Python :: 3.13
15
15
  Classifier: License :: OSI Approved :: MIT License
16
16
  Requires-Python: >=3.9
17
17
  Description-Content-Type: text/markdown
18
- Requires-Dist: stac-fastapi.core==6.1.0
18
+ Requires-Dist: stac-fastapi.core==6.2.0
19
19
 
20
20
  # stac-fastapi-elasticsearch-opensearch
21
21
 
@@ -104,6 +104,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
104
104
  - [Auth](#auth)
105
105
  - [Aggregation](#aggregation)
106
106
  - [Rate Limiting](#rate-limiting)
107
+ - [Datetime-Based Index Management](#datetime-based-index-management)
107
108
 
108
109
  ## Documentation & Resources
109
110
 
@@ -249,6 +250,81 @@ You can customize additional settings in your `.env` file:
249
250
  > [!NOTE]
250
251
  > The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
251
252
 
253
+ ## Datetime-Based Index Management
254
+
255
+ ### Overview
256
+
257
+ SFEOS supports two indexing strategies for managing STAC items:
258
+
259
+ 1. **Simple Indexing** (default) - One index per collection
260
+ 2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
261
+
262
+ The datetime-based indexing strategy is particularly useful for large temporal datasets. When a query includes a datetime parameter, the system can search only the indexes whose date range overlaps the query, which can make searches **several times faster** and significantly **reduce database load**.
263
+
264
+ ### When to Use
265
+
266
+ **Recommended for:**
267
+ - Systems with large collections containing millions of items
268
+ - Systems requiring high-performance temporal searching
269
+
270
+ **Pros:**
271
+ - Queries with a datetime filter can run several times faster
272
+ - Reduced database load - only relevant indexes are searched
273
+
274
+ **Cons:**
275
+ - Slightly longer item indexing time (automatic index management)
276
+ - Greater management complexity
277
+
278
+ ### Configuration
279
+
280
+ #### Enabling Datetime-Based Indexing
281
+
282
+ Enable datetime-based indexing by setting the following environment variable:
283
+
284
+ ```bash
285
+ ENABLE_DATETIME_INDEX_FILTERING=true
286
+ ```
287
+
288
+ #### Related Configuration Variables
289
+
290
+ | Variable | Description | Default | Example |
291
+ |----------|-------------|---------|---------|
292
+ | `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
293
+ | `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
294
+ | `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
295
+
296
+ ## How Datetime-Based Indexing Works
297
+
298
+ ### Index and Alias Naming Convention
299
+
300
+ The system uses a precise naming convention:
301
+
302
+ **Physical indexes:**
303
+ ```
304
+ {ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
305
+ ```
306
+
307
+ **Aliases:**
308
+ ```
309
+ {ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
310
+ {ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
311
+ {ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
312
+ ```
313
+
314
+ **Example:**
315
+
316
+ *Physical indexes:*
317
+ - `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
318
+
319
+ *Aliases:*
320
+ - `items_sentinel-2-l2a` - main collection alias
321
+ - `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
322
+ - `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
323
+
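The naming convention above can be sketched in Python. This is a minimal illustration rather than the library's implementation: the helper names `physical_index_name` and `alias_name` are hypothetical, and the prefix is hard-coded to the documented default `items_`.

```python
import uuid
from typing import Optional

# Documented default prefix; hard-coded here as an assumption for illustration.
ITEMS_INDEX_PREFIX = "items_"


def physical_index_name(collection_id: str) -> str:
    """Hypothetical helper: prefix + collection id + random uuid4."""
    return f"{ITEMS_INDEX_PREFIX}{collection_id.lower()}_{uuid.uuid4()}"


def alias_name(
    collection_id: str, start: Optional[str] = None, end: Optional[str] = None
) -> str:
    """Hypothetical helper: build the main, temporal, or closed-index alias."""
    name = f"{ITEMS_INDEX_PREFIX}{collection_id.lower()}"
    if start:
        name += f"_{start}"  # temporal alias: index is still open
        if end:
            name += f"_{end}"  # closed index alias: index reached its limit
    return name
```

Calling `alias_name("sentinel-2-l2a", "2024-01-01", "2024-03-15")` reproduces the closed index alias from the example list.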
324
+ ### Index Size Management
325
+
326
+ **Important - Data Compression:** Elasticsearch and OpenSearch compress data automatically, and the configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. Set the limit about 20% above your target size to account for compression overhead and metadata.
327
+
252
328
  ## Interacting with the API
253
329
 
254
330
  - **Creating a Collection**:
@@ -557,4 +633,3 @@ You can customize additional settings in your `.env` file:
557
633
  - Ensures fair resource allocation among all clients
558
634
 
559
635
  - **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
560
-
@@ -0,0 +1,32 @@
1
+ stac_fastapi/sfeos_helpers/mappings.py,sha256=z6GJFJUE7bRKF9ODc8_ddkb7JCOokMtj4p2LeaQqrQQ,8237
2
+ stac_fastapi/sfeos_helpers/version.py,sha256=ro2d3oERQL2KxSo7qmbU0z6qT77XShwY6vsqrLf2VFw,45
3
+ stac_fastapi/sfeos_helpers/aggregation/__init__.py,sha256=Mym17lFh90by1GnoQgMyIKAqRNJnvCgVSXDYzjBiPQk,1210
4
+ stac_fastapi/sfeos_helpers/aggregation/client.py,sha256=PPUk0kAZnms46FlLGrR5w8wa52vG-dT6BG37896R5CY,17939
5
+ stac_fastapi/sfeos_helpers/aggregation/format.py,sha256=qUW1jjh2EEjy-V7riliFR77grpi-AgsTmP76z60K5Lo,2011
6
+ stac_fastapi/sfeos_helpers/database/__init__.py,sha256=T0YwePfhG3ukL1oUFCh3FYHA9jZZe36FJRYCQplfb18,2645
7
+ stac_fastapi/sfeos_helpers/database/datetime.py,sha256=XMyi9Q09cuP_hj97qbGbHFtelq7WQVPdehUfzqNZFV4,4040
8
+ stac_fastapi/sfeos_helpers/database/document.py,sha256=LtjX15gvaOuZC_k2t_oQhys_c-zRTLN5rwX0hNJkHnM,1725
9
+ stac_fastapi/sfeos_helpers/database/index.py,sha256=g7_sKfd5XUwq4IhdKRNiasejk045dKlullsdeDSZTq8,6585
10
+ stac_fastapi/sfeos_helpers/database/mapping.py,sha256=4-MSd4xH5wg7yoC4aPjzYMDSEvP026bw4k2TfffMT5E,1387
11
+ stac_fastapi/sfeos_helpers/database/query.py,sha256=k6aS20gnnjJzDynJ2AaPstpAPXEeWUEbPSkDffOf2To,4001
12
+ stac_fastapi/sfeos_helpers/database/utils.py,sha256=nPTWqP-pdSZumeYRx92XTN4xmmufRSqCDHaiSLdg-3o,7311
13
+ stac_fastapi/sfeos_helpers/filter/__init__.py,sha256=n3zL_MhEGOoxMz1KeijyK_UKiZ0MKPl90zHtYI5RAy8,1557
14
+ stac_fastapi/sfeos_helpers/filter/client.py,sha256=QwjYWXkevoVS7HPtoXfeSzDy-_GJnFhPJtJM49D14oU,4229
15
+ stac_fastapi/sfeos_helpers/filter/cql2.py,sha256=Cg9kRYD9CVkVSyRqOyB5oVXmlyteSn2bw88sqklGpUM,955
16
+ stac_fastapi/sfeos_helpers/filter/transform.py,sha256=1GEWQSp-rbq7_1nDVv1ApDbWxt8DswJWxwaxzV85gj4,4644
17
+ stac_fastapi/sfeos_helpers/models/patch.py,sha256=akzfF7vXI6AyGDrxgP8KqVwBvBDQayXYNNf4pm6Q8qM,4209
18
+ stac_fastapi/sfeos_helpers/search_engine/__init__.py,sha256=Bi0cAtul3FuLjFceTPtEcaWNBfmUX5vKaqDvbSUAm0o,754
19
+ stac_fastapi/sfeos_helpers/search_engine/base.py,sha256=9KOLW3NjW9PzWQzqLuhIjQU7FOHdDnB3ZNwDq469JZU,1400
20
+ stac_fastapi/sfeos_helpers/search_engine/factory.py,sha256=nPty3L8esypSVIzl5IKfmqQ1hVUIjMQ183Ksistr1bM,1066
21
+ stac_fastapi/sfeos_helpers/search_engine/index_operations.py,sha256=mYt1C2HEdwtslNwRHiZkvYTWQSZDwBBnKo5YJXfxnDo,5565
22
+ stac_fastapi/sfeos_helpers/search_engine/inserters.py,sha256=o-I_4OowMJetMwRFPdq8Oix_DAkMNGBw4fYyoa5W6s0,10562
23
+ stac_fastapi/sfeos_helpers/search_engine/managers.py,sha256=nldomKmw8iQfOxeGZbBRGG_rWk-vB5Hy_cOjJ2e0ArE,6454
24
+ stac_fastapi/sfeos_helpers/search_engine/selection/__init__.py,sha256=qKd4KzZkERwF_yhIeFcjAUnq5vQarr3CuXxE3SWmt6c,441
25
+ stac_fastapi/sfeos_helpers/search_engine/selection/base.py,sha256=106c4FK50cgMmTpPJkWdgbExPkU2yIH4Wq684Ww-fYE,859
26
+ stac_fastapi/sfeos_helpers/search_engine/selection/cache_manager.py,sha256=5yrgf9JA4mgRNMPDKih6xySF8mD724lEWnXhWud7m2c,4039
27
+ stac_fastapi/sfeos_helpers/search_engine/selection/factory.py,sha256=vbgNVCUW2lviePqzpgsPLxp6IEqcX3GHiahqN2oVObA,1305
28
+ stac_fastapi/sfeos_helpers/search_engine/selection/selectors.py,sha256=q83nfCfNfLUqtkHpORwNHNRU9Pa-heeaDIPO0RlHb-8,4779
29
+ sfeos_helpers-6.2.0.dist-info/METADATA,sha256=2CP3C0hpgotEerT4UZGsL16FCKtf--6Qm7zbNA8IE14,34297
30
+ sfeos_helpers-6.2.0.dist-info/WHEEL,sha256=tZoeGjtWxWRfdplE7E3d45VPlLNQnvbKiYnx7gwAy8A,92
31
+ sfeos_helpers-6.2.0.dist-info/top_level.txt,sha256=vqn-D9-HsRPTTxy0Vk_KkDmTiMES4owwBQ3ydSZYb2s,13
32
+ sfeos_helpers-6.2.0.dist-info/RECORD,,
@@ -313,9 +313,11 @@ class EsAsyncBaseAggregationClient(AsyncBaseAggregationClient):
313
313
  )
314
314
 
315
315
  if aggregate_request.datetime:
316
- search = self.database.apply_datetime_filter(
317
- search=search, interval=aggregate_request.datetime
316
+ search, datetime_search = self.database.apply_datetime_filter(
317
+ search=search, datetime=aggregate_request.datetime
318
318
  )
319
+ else:
320
+ datetime_search = {"gte": None, "lte": None}
319
321
 
320
322
  if aggregate_request.bbox:
321
323
  bbox = aggregate_request.bbox
@@ -414,6 +416,7 @@ class EsAsyncBaseAggregationClient(AsyncBaseAggregationClient):
414
416
  geometry_geohash_grid_precision,
415
417
  geometry_geotile_grid_precision,
416
418
  datetime_frequency_interval,
419
+ datetime_search,
417
420
  )
418
421
  except Exception as error:
419
422
  if not isinstance(error, IndexError):
@@ -30,11 +30,12 @@ Function Naming Conventions:
30
30
  """
31
31
 
32
32
  # Re-export all functions for backward compatibility
33
- from .datetime import return_date
33
+ from .datetime import extract_date, extract_first_date_from_index, return_date
34
34
  from .document import mk_actions, mk_item_id
35
35
  from .index import (
36
36
  create_index_templates_shared,
37
37
  delete_item_index_shared,
38
+ filter_indexes_by_datetime,
38
39
  index_alias_by_collection_id,
39
40
  index_by_collection_id,
40
41
  indices,
@@ -53,6 +54,7 @@ __all__ = [
53
54
  "delete_item_index_shared",
54
55
  "index_alias_by_collection_id",
55
56
  "index_by_collection_id",
57
+ "filter_indexes_by_datetime",
56
58
  "indices",
57
59
  # Query operations
58
60
  "apply_free_text_filter_shared",
@@ -68,4 +70,6 @@ __all__ = [
68
70
  "get_bool_env",
69
71
  # Datetime utilities
70
72
  "return_date",
73
+ "extract_date",
74
+ "extract_first_date_from_index",
71
75
  ]
@@ -4,14 +4,19 @@ This module provides datetime utility functions specifically designed for
4
4
  Elasticsearch and OpenSearch query formatting.
5
5
  """
6
6
 
7
+ import logging
8
+ import re
9
+ from datetime import date
7
10
  from datetime import datetime as datetime_type
8
11
  from typing import Dict, Optional, Union
9
12
 
10
13
  from stac_fastapi.types.rfc3339 import DateTimeType
11
14
 
15
+ logger = logging.getLogger(__name__)
16
+
12
17
 
13
18
  def return_date(
14
- interval: Optional[Union[DateTimeType, str]]
19
+ interval: Optional[Union[DateTimeType, str]],
15
20
  ) -> Dict[str, Optional[str]]:
16
21
  """
17
22
  Convert a date interval to an Elasticsearch/OpenSearch query format.
@@ -39,8 +44,14 @@ def return_date(
39
44
  if isinstance(interval, str):
40
45
  if "/" in interval:
41
46
  parts = interval.split("/")
42
- result["gte"] = parts[0] if parts[0] != ".." else None
43
- result["lte"] = parts[1] if len(parts) > 1 and parts[1] != ".." else None
47
+ result["gte"] = (
48
+ parts[0] if parts[0] != ".." else datetime_type.min.isoformat() + "Z"
49
+ )
50
+ result["lte"] = (
51
+ parts[1]
52
+ if len(parts) > 1 and parts[1] != ".."
53
+ else datetime_type.max.isoformat() + "Z"
54
+ )
44
55
  else:
45
56
  converted_time = interval if interval != ".." else None
46
57
  result["gte"] = result["lte"] = converted_time
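The hunk above changes how an open-ended string interval is handled: `..` now maps to the minimum or maximum representable datetime instead of `None`. The following stand-alone sketch copies only that string branch so the effect can be tried in isolation; the function name `open_ended_bounds` is hypothetical and is not part of the package.

```python
from datetime import datetime
from typing import Dict, Optional


def open_ended_bounds(interval: str) -> Dict[str, Optional[str]]:
    """Hypothetical stand-alone copy of the string-interval branch in the hunk."""
    result: Dict[str, Optional[str]] = {"gte": None, "lte": None}
    if "/" in interval:
        parts = interval.split("/")
        # An open start ("..") becomes the smallest representable datetime.
        result["gte"] = parts[0] if parts[0] != ".." else datetime.min.isoformat() + "Z"
        # An open end ("..") becomes the largest representable datetime.
        result["lte"] = (
            parts[1]
            if len(parts) > 1 and parts[1] != ".."
            else datetime.max.isoformat() + "Z"
        )
    else:
        # A single instant is used for both bounds, as in the context lines.
        converted = interval if interval != ".." else None
        result["gte"] = result["lte"] = converted
    return result


print(open_ended_bounds("2024-01-01T00:00:00Z/.."))
# {'gte': '2024-01-01T00:00:00Z', 'lte': '9999-12-31T23:59:59.999999Z'}
```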
@@ -58,3 +69,53 @@ def return_date(
58
69
  result["lte"] = end.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
59
70
 
60
71
  return result
72
+
73
+
74
+ def extract_date(date_str: str) -> date:
75
+ """Extract date from ISO format string.
76
+
77
+ Args:
78
+ date_str: ISO format date string
79
+
80
+ Returns:
81
+ A date object extracted from the input string.
82
+ """
83
+ date_str = date_str.replace("Z", "+00:00")
84
+ return datetime_type.fromisoformat(date_str).date()
85
+
86
+
87
+ def extract_first_date_from_index(index_name: str) -> date:
88
+ """Extract the first date from an index name containing date patterns.
89
+
90
+ Searches for date patterns (YYYY-MM-DD) within the index name string
91
+ and returns the first found date as a date object.
92
+
93
+ Args:
94
+ index_name: Index name containing date patterns.
95
+
96
+ Returns:
97
+ A date object extracted from the first date pattern found in the index name.
98
+
99
+ """
100
+ date_pattern = r"\d{4}-\d{2}-\d{2}"
101
+ match = re.search(date_pattern, index_name)
102
+
103
+ if not match:
104
+ logger.error(f"No date pattern found in index name: '{index_name}'")
105
+ raise ValueError(
106
+ f"No date pattern (YYYY-MM-DD) found in index name: '{index_name}'"
107
+ )
108
+
109
+ date_string = match.group(0)
110
+
111
+ try:
112
+ extracted_date = datetime_type.strptime(date_string, "%Y-%m-%d").date()
113
+ return extracted_date
114
+ except ValueError as e:
115
+ logger.error(
116
+ f"Invalid date format found in index name '{index_name}': "
117
+ f"'{date_string}' - {str(e)}"
118
+ )
119
+ raise ValueError(
120
+ f"Invalid date format in index name '{index_name}': '{date_string}'"
121
+ ) from e
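The two helpers added in this hunk can be exercised with a small stand-alone sketch. The functions below mirror the logic shown above using only the standard library; the names `first_date_in` and `date_of` are hypothetical stand-ins so the snippet runs without the package installed.

```python
import re
from datetime import date, datetime

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def first_date_in(index_name: str) -> date:
    """Return the first YYYY-MM-DD date embedded in an index or alias name."""
    match = DATE_PATTERN.search(index_name)
    if not match:
        raise ValueError(f"No date pattern (YYYY-MM-DD) found in: {index_name!r}")
    # strptime raises ValueError for impossible dates such as 2024-13-45.
    return datetime.strptime(match.group(0), "%Y-%m-%d").date()


def date_of(iso_string: str) -> date:
    """Return the date part of an ISO 8601 string, accepting a trailing 'Z'."""
    return datetime.fromisoformat(iso_string.replace("Z", "+00:00")).date()
```

For a closed alias such as `items_sentinel-2-l2a_2024-01-01_2024-03-15`, `first_date_in` returns the start date; the digits in the collection id do not match because the pattern requires a four-digit year.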
@@ -3,9 +3,13 @@
3
3
  This module provides functions for creating and managing indices in Elasticsearch/OpenSearch.
4
4
  """
5
5
 
6
+ import re
7
+ from datetime import datetime
6
8
  from functools import lru_cache
7
9
  from typing import Any, List, Optional
8
10
 
11
+ from dateutil.parser import parse # type: ignore[import]
12
+
9
13
  from stac_fastapi.sfeos_helpers.mappings import (
10
14
  _ES_INDEX_NAME_UNSUPPORTED_CHARS_TABLE,
11
15
  COLLECTIONS_INDEX,
@@ -66,6 +70,59 @@ def indices(collection_ids: Optional[List[str]]) -> str:
66
70
  )
67
71
 
68
72
 
73
+ def filter_indexes_by_datetime(
74
+ indexes: List[str], gte: Optional[str], lte: Optional[str]
75
+ ) -> List[str]:
76
+ """Filter indexes based on datetime range extracted from index names.
77
+
78
+ Args:
79
+ indexes: List of index names containing dates
80
+ gte: Greater than or equal date filter (ISO format, optional 'Z' suffix)
81
+ lte: Less than or equal date filter (ISO format, optional 'Z' suffix)
82
+
83
+ Returns:
84
+ List of filtered index names
85
+ """
86
+
87
+ def parse_datetime(dt_str: str) -> datetime:
88
+ """Parse datetime string, handling both with and without 'Z' suffix."""
89
+ return parse(dt_str).replace(tzinfo=None)
90
+
91
+ def extract_date_range_from_index(index_name: str) -> tuple:
92
+ """Extract start and end dates from index name."""
93
+ date_pattern = r"(\d{4}-\d{2}-\d{2})"
94
+ dates = re.findall(date_pattern, index_name)
95
+
96
+ if len(dates) == 1:
97
+ start_date = datetime.strptime(dates[0], "%Y-%m-%d")
98
+ max_date = datetime.max.replace(microsecond=0)
99
+ return start_date, max_date
100
+ else:
101
+ start_date = datetime.strptime(dates[0], "%Y-%m-%d")
102
+ end_date = datetime.strptime(dates[1], "%Y-%m-%d")
103
+ return start_date, end_date
104
+
105
+ def is_index_in_range(
106
+ start_date: datetime, end_date: datetime, gte_dt: datetime, lte_dt: datetime
107
+ ) -> bool:
108
+ """Check if index date range overlaps with filter range."""
109
+ return not (
110
+ end_date.date() < gte_dt.date() or start_date.date() > lte_dt.date()
111
+ )
112
+
113
+ gte_dt = parse_datetime(gte) if gte else datetime.min.replace(microsecond=0)
114
+ lte_dt = parse_datetime(lte) if lte else datetime.max.replace(microsecond=0)
115
+
116
+ filtered_indexes = []
117
+
118
+ for index in indexes:
119
+ start_date, end_date = extract_date_range_from_index(index)
120
+ if is_index_in_range(start_date, end_date, gte_dt, lte_dt):
121
+ filtered_indexes.append(index)
122
+
123
+ return filtered_indexes
124
+
125
+
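The overlap test in `filter_indexes_by_datetime` can be illustrated with a simplified, standard-library-only sketch. It is not the packaged function: the names `alias_range`, `to_date`, and `overlapping` are hypothetical, `datetime.fromisoformat` stands in for `dateutil.parser.parse`, and a name without any date raises `ValueError` here as an explicit choice of this sketch.

```python
import re
from datetime import date, datetime
from typing import List, Optional, Tuple

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def alias_range(alias: str) -> Tuple[date, date]:
    """Start and end dates encoded in an alias; an open alias ends at date.max."""
    found = [datetime.strptime(d, "%Y-%m-%d").date() for d in DATE_RE.findall(alias)]
    if not found:
        raise ValueError(f"no date in alias: {alias!r}")
    return found[0], (found[1] if len(found) > 1 else date.max)


def to_date(value: Optional[str], default: date) -> date:
    """Parse an ISO 8601 string (optional 'Z' suffix) or fall back to a default."""
    if not value:
        return default
    return datetime.fromisoformat(value.replace("Z", "+00:00")).date()


def overlapping(aliases: List[str], gte: Optional[str], lte: Optional[str]) -> List[str]:
    """Keep aliases whose date range overlaps the [gte, lte] filter (inclusive)."""
    low, high = to_date(gte, date.min), to_date(lte, date.max)
    selected = []
    for alias in aliases:
        start, end = alias_range(alias)
        # Two ranges overlap unless one ends before the other starts.
        if not (end < low or start > high):
            selected.append(alias)
    return selected
```

Because the comparison is made on whole dates and is inclusive at both ends, a query that touches the boundary day of a closed index selects both the closed index and the one that follows it.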
69
126
  async def create_index_templates_shared(settings: Any) -> None:
70
127
  """Create index templates for Elasticsearch/OpenSearch Collection and Item indices.
71
128
 
@@ -120,11 +177,11 @@ async def delete_item_index_shared(settings: Any, collection_id: str) -> None:
120
177
  client = settings.create_client
121
178
 
122
179
  name = index_alias_by_collection_id(collection_id)
123
- resolved = await client.indices.resolve_index(name=name)
180
+ resolved = await client.indices.resolve_index(name=name, ignore=[404])
124
181
  if "aliases" in resolved and resolved["aliases"]:
125
182
  [alias] = resolved["aliases"]
126
183
  await client.indices.delete_alias(index=alias["indices"], name=alias["name"])
127
184
  await client.indices.delete(index=alias["indices"])
128
185
  else:
129
- await client.indices.delete(index=name)
186
+ await client.indices.delete(index=name, ignore=[404])
130
187
  await client.close()
@@ -0,0 +1,27 @@
1
+ """Search engine index management package."""
2
+
3
+ from .base import BaseIndexInserter
4
+ from .factory import IndexInsertionFactory
5
+ from .index_operations import IndexOperations
6
+ from .inserters import DatetimeIndexInserter, SimpleIndexInserter
7
+ from .managers import DatetimeIndexManager, IndexSizeManager
8
+ from .selection import (
9
+ BaseIndexSelector,
10
+ DatetimeBasedIndexSelector,
11
+ IndexSelectorFactory,
12
+ UnfilteredIndexSelector,
13
+ )
14
+
15
+ __all__ = [
16
+ "BaseIndexInserter",
17
+ "BaseIndexSelector",
18
+ "IndexOperations",
19
+ "IndexSizeManager",
20
+ "DatetimeIndexManager",
21
+ "DatetimeIndexInserter",
22
+ "SimpleIndexInserter",
23
+ "IndexInsertionFactory",
24
+ "DatetimeBasedIndexSelector",
25
+ "UnfilteredIndexSelector",
26
+ "IndexSelectorFactory",
27
+ ]
@@ -0,0 +1,51 @@
1
+ """Base classes for index inserters."""
2
+
3
+ from abc import ABC, abstractmethod
4
+ from typing import Any, Dict, List
5
+
6
+
7
+ class BaseIndexInserter(ABC):
8
+ """Base async index inserter with common async methods."""
9
+
10
+ @abstractmethod
11
+ async def get_target_index(
12
+ self, collection_id: str, product: Dict[str, Any]
13
+ ) -> str:
14
+ """Get target index for a product asynchronously.
15
+
16
+ Args:
17
+ collection_id (str): Collection identifier.
18
+ product (Dict[str, Any]): Product data.
19
+
20
+ Returns:
21
+ str: Target index name.
22
+ """
23
+ pass
24
+
25
+ @abstractmethod
26
+ async def prepare_bulk_actions(
27
+ self, collection_id: str, items: List[Dict[str, Any]]
28
+ ) -> List[Dict[str, Any]]:
29
+ """Prepare bulk actions for multiple items asynchronously.
30
+
31
+ Args:
32
+ collection_id (str): Collection identifier.
33
+ items (List[Dict[str, Any]]): List of items to process.
34
+
35
+ Returns:
36
+ List[Dict[str, Any]]: List of bulk actions.
37
+ """
38
+ pass
39
+
40
+ @abstractmethod
41
+ async def create_simple_index(self, client: Any, collection_id: str) -> str:
42
+ """Create a simple index asynchronously.
43
+
44
+ Args:
45
+ client: Search engine client instance.
46
+ collection_id (str): Collection identifier.
47
+
48
+ Returns:
49
+ str: Created index name.
50
+ """
51
+ pass
@@ -0,0 +1,36 @@
1
+ """Factory for creating index insertion strategies."""
2
+
3
+ from typing import Any
4
+
5
+ from stac_fastapi.core.utilities import get_bool_env
6
+
7
+ from .base import BaseIndexInserter
8
+ from .index_operations import IndexOperations
9
+ from .inserters import DatetimeIndexInserter, SimpleIndexInserter
10
+
11
+
12
+ class IndexInsertionFactory:
13
+ """Factory for creating index insertion strategies."""
14
+
15
+ @staticmethod
16
+ def create_insertion_strategy(
17
+ client: Any,
18
+ ) -> BaseIndexInserter:
19
+ """Create async insertion strategy based on configuration.
20
+
21
+ Args:
22
+ client: Async search engine client instance.
23
+
24
+ Returns:
25
+ BaseIndexInserter: Configured async insertion strategy.
26
+ """
27
+ index_operations = IndexOperations()
28
+
29
+ use_datetime_partitioning = get_bool_env(
30
+ "ENABLE_DATETIME_INDEX_FILTERING", default="false"
31
+ )
32
+
33
+ if use_datetime_partitioning:
34
+ return DatetimeIndexInserter(client, index_operations)
35
+ else:
36
+ return SimpleIndexInserter(index_operations, client)
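The factory above selects an insertion strategy from an environment flag. A minimal sketch of that pattern follows, with loud assumptions: `SimpleStrategy`, `DatetimeStrategy`, `env_flag`, and `choose_strategy` are hypothetical stand-ins, and the set of values `env_flag` treats as true is an assumption of this sketch, not the documented behaviour of `get_bool_env`.

```python
import os


class SimpleStrategy:
    """Hypothetical stand-in for the one-index-per-collection inserter."""


class DatetimeStrategy:
    """Hypothetical stand-in for the time-partitioned inserter."""


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean environment variable (accepted values are an assumption)."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"true", "1"}


def choose_strategy():
    """Return the datetime strategy only when the flag is explicitly enabled."""
    if env_flag("ENABLE_DATETIME_INDEX_FILTERING"):
        return DatetimeStrategy()
    return SimpleStrategy()
```

Reading the flag inside the factory call, rather than at import time, means each call reflects the environment at that moment.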
@@ -0,0 +1,167 @@
1
+ """Search engine adapters for different implementations."""
2
+
3
+ import uuid
4
+ from typing import Any, Dict
5
+
6
+ from stac_fastapi.sfeos_helpers.database import (
7
+ index_alias_by_collection_id,
8
+ index_by_collection_id,
9
+ )
10
+ from stac_fastapi.sfeos_helpers.mappings import (
11
+ _ES_INDEX_NAME_UNSUPPORTED_CHARS_TABLE,
12
+ ES_ITEMS_MAPPINGS,
13
+ ES_ITEMS_SETTINGS,
14
+ ITEMS_INDEX_PREFIX,
15
+ )
16
+
17
+
18
+ class IndexOperations:
19
+ """Base class for search engine adapters with common implementations."""
20
+
21
+ async def create_simple_index(self, client: Any, collection_id: str) -> str:
22
+ """Create a simple index for the given collection.
23
+
24
+ Args:
25
+ client: Search engine client instance.
26
+ collection_id (str): Collection identifier.
27
+
28
+ Returns:
29
+ str: Created index name.
30
+ """
31
+ index_name = f"{index_by_collection_id(collection_id)}-000001"
32
+ alias_name = index_alias_by_collection_id(collection_id)
33
+
34
+ await client.indices.create(
35
+ index=index_name,
36
+ body=self._create_index_body({alias_name: {}}),
37
+ params={"ignore": [400]},
38
+ )
39
+ return index_name
40
+
41
+ async def create_datetime_index(
42
+ self, client: Any, collection_id: str, start_date: str
43
+ ) -> str:
44
+ """Create a datetime-based index for the given collection.
45
+
46
+ Args:
47
+ client: Search engine client instance.
48
+ collection_id (str): Collection identifier.
49
+ start_date (str): Start date for the alias.
50
+
51
+ Returns:
52
+ str: Created index alias name.
53
+ """
54
+ index_name = self.create_index_name(collection_id)
55
+ alias_name = self.create_alias_name(collection_id, start_date)
56
+ collection_alias = index_alias_by_collection_id(collection_id)
57
+ await client.indices.create(
58
+ index=index_name,
59
+ body=self._create_index_body({collection_alias: {}, alias_name: {}}),
60
+ )
61
+ return alias_name
62
+
63
+ @staticmethod
64
+ async def update_index_alias(client: Any, end_date: str, old_alias: str) -> str:
65
+ """Update index alias with new end date.
66
+
67
+ Args:
68
+ client: Search engine client instance.
69
+ end_date (str): End date for the alias.
70
+ old_alias (str): Current alias name.
71
+
72
+ Returns:
73
+ str: New alias name.
74
+ """
75
+ new_alias = f"{old_alias}-{end_date}"
76
+ aliases_info = await client.indices.get_alias(name=old_alias)
77
+ actions = []
78
+
79
+ for index_name in aliases_info.keys():
80
+ actions.append({"remove": {"index": index_name, "alias": old_alias}})
81
+ actions.append({"add": {"index": index_name, "alias": new_alias}})
82
+
83
+ await client.indices.update_aliases(body={"actions": actions})
84
+ return new_alias
85
+
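The alias rename in `update_index_alias` amounts to building a list of paired `remove` and `add` actions and sending them in a single `update_aliases` request, which Elasticsearch and OpenSearch apply atomically. The sketch below isolates the list-building step as a pure function; `rename_alias_actions` is a hypothetical name, and the `aliases_info` argument is assumed to have the shape of a `get_alias` response (a dict keyed by index name).

```python
from typing import Any, Dict, List


def rename_alias_actions(
    aliases_info: Dict[str, Any], old_alias: str, new_alias: str
) -> List[Dict[str, Dict[str, str]]]:
    """Build remove/add action pairs for every index that carries old_alias."""
    actions: List[Dict[str, Dict[str, str]]] = []
    for index_name in aliases_info:
        actions.append({"remove": {"index": index_name, "alias": old_alias}})
        actions.append({"add": {"index": index_name, "alias": new_alias}})
    return actions


# Assumed get_alias response shape for one physical index.
info = {"items_c_a1b2": {"aliases": {"items_c_2024-01-01": {}}}}
body = {"actions": rename_alias_actions(info, "items_c_2024-01-01", "items_c_2024-01-01-2024-03-15")}
```

Note that the method above joins the old alias and the end date with a hyphen (`f"{old_alias}-{end_date}"`), so the example uses the same separator.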
86
+ @staticmethod
87
+ async def change_alias_name(client: Any, old_alias: str, new_alias: str) -> None:
88
+ """Change alias name from old to new.
89
+
90
+ Args:
91
+ client: Search engine client instance.
92
+ old_alias (str): Current alias name.
93
+ new_alias (str): New alias name.
94
+
95
+ Returns:
96
+ None
97
+ """
98
+ aliases_info = await client.indices.get_alias(name=old_alias)
99
+ actions = []
100
+
101
+ for index_name in aliases_info.keys():
102
+ actions.append({"remove": {"index": index_name, "alias": old_alias}})
103
+ actions.append({"add": {"index": index_name, "alias": new_alias}})
104
+ await client.indices.update_aliases(body={"actions": actions})
105
+
106
+ @staticmethod
107
+ def create_index_name(collection_id: str) -> str:
108
+ """Create index name from collection ID and uuid4.
109
+
110
+ Args:
111
+ collection_id (str): Collection identifier.
112
+
113
+ Returns:
114
+ str: Formatted index name.
115
+ """
116
+ cleaned = collection_id.translate(_ES_INDEX_NAME_UNSUPPORTED_CHARS_TABLE)
117
+ return f"{ITEMS_INDEX_PREFIX}{cleaned.lower()}_{uuid.uuid4()}"
118
+
119
+ @staticmethod
120
+ def create_alias_name(collection_id: str, start_date: str) -> str:
121
+ """Create alias name from collection ID and start date.
122
+
123
+ Args:
124
+ collection_id (str): Collection identifier.
125
+ start_date (str): Start date for the alias.
126
+
127
+ Returns:
128
+ str: Alias name with initial date.
129
+ """
130
+ cleaned = collection_id.translate(_ES_INDEX_NAME_UNSUPPORTED_CHARS_TABLE)
131
+ return f"{ITEMS_INDEX_PREFIX}{cleaned.lower()}_{start_date}"
132
+
133
+ @staticmethod
134
+ def _create_index_body(aliases: Dict[str, Dict]) -> Dict[str, Any]:
135
+ """Create index body with common settings.
136
+
137
+ Args:
138
+ aliases (Dict[str, Dict]): Aliases configuration.
139
+
140
+ Returns:
141
+ Dict[str, Any]: Index body configuration.
142
+ """
143
+ return {
144
+ "aliases": aliases,
145
+ "mappings": ES_ITEMS_MAPPINGS,
146
+ "settings": ES_ITEMS_SETTINGS,
147
+ }
148
+
149
+ @staticmethod
150
+ async def find_latest_item_in_index(client: Any, index_name: str) -> dict[str, Any]:
151
+ """Find the item with the latest datetime in the specified index.
152
+
153
+ Args:
154
+ client: Search engine client instance.
155
+ index_name (str): Name of the index to query.
156
+
157
+ Returns:
158
+ dict[str, Any]: Search hit for the item with the latest `properties.datetime` in the index.
159
+ """
160
+ query = {
161
+ "size": 1,
162
+ "sort": [{"properties.datetime": {"order": "desc"}}],
163
+ "_source": ["properties.datetime"],
164
+ }
165
+
166
+ response = await client.search(index=index_name, body=query)
167
+ return response["hits"]["hits"][0]
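As written, `find_latest_item_in_index` indexes `hits[0]` directly, so an index with no documents would raise `IndexError`. The sketch below shows one defensive variant under stated assumptions: `FakeClient` is a hypothetical stand-in for an async search client that returns a canned response, and `latest_item_or_none` is a hypothetical name, not part of the package.

```python
import asyncio
from typing import Any, Dict, List, Optional


class FakeClient:
    """Hypothetical stand-in for an async search client with canned hits."""

    def __init__(self, hits: List[Dict[str, Any]]) -> None:
        self._hits = hits

    async def search(self, index: str, body: Dict[str, Any]) -> Dict[str, Any]:
        return {"hits": {"hits": self._hits}}


async def latest_item_or_none(client: Any, index_name: str) -> Optional[Dict[str, Any]]:
    """Return the hit with the latest properties.datetime, or None if empty."""
    query = {
        "size": 1,
        "sort": [{"properties.datetime": {"order": "desc"}}],
        "_source": ["properties.datetime"],
    }
    response = await client.search(index=index_name, body=query)
    hits = response["hits"]["hits"]
    return hits[0] if hits else None
```

A caller can then run, for example, `asyncio.run(latest_item_or_none(FakeClient([]), "items_c"))` and handle the `None` result instead of catching an exception.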