sfeos-helpers 6.5.1__py3-none-any.whl → 6.7.0__py3-none-any.whl

@@ -1,714 +0,0 @@
1
- Metadata-Version: 2.1
2
- Name: sfeos-helpers
3
- Version: 6.5.1
4
- Summary: Helper library for the Elasticsearch and Opensearch stac-fastapi backends.
5
- Home-page: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
6
- License: MIT
7
- Classifier: Intended Audience :: Developers
8
- Classifier: Intended Audience :: Information Technology
9
- Classifier: Intended Audience :: Science/Research
10
- Classifier: Programming Language :: Python :: 3.9
11
- Classifier: Programming Language :: Python :: 3.10
12
- Classifier: Programming Language :: Python :: 3.11
13
- Classifier: Programming Language :: Python :: 3.12
14
- Classifier: Programming Language :: Python :: 3.13
15
- Classifier: License :: OSI Approved :: MIT License
16
- Requires-Python: >=3.9
17
- Description-Content-Type: text/markdown
18
- Requires-Dist: stac-fastapi.core==6.5.1
19
-
20
- # stac-fastapi-elasticsearch-opensearch
21
-
22
- <!-- markdownlint-disable MD033 MD041 -->
23
-
24
-
25
- <p align="left">
26
- <img src="https://raw.githubusercontent.com/stac-utils/stac-fastapi-elasticsearch-opensearch/refs/heads/main/assets/sfeos.png" width=1000>
27
- </p>
28
-
29
- **Jump to:** [Project Introduction](#project-introduction---what-is-sfeos) | [Quick Start](#quick-start) | [Table of Contents](#table-of-contents)
30
-
31
- [![Downloads](https://static.pepy.tech/badge/stac-fastapi-core?color=blue)](https://pepy.tech/project/stac-fastapi-core)
32
- [![GitHub contributors](https://img.shields.io/github/contributors/stac-utils/stac-fastapi-elasticsearch-opensearch?color=blue)](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/graphs/contributors)
33
- [![GitHub stars](https://img.shields.io/github/stars/stac-utils/stac-fastapi-elasticsearch-opensearch.svg?color=blue)](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/stargazers)
34
- [![GitHub forks](https://img.shields.io/github/forks/stac-utils/stac-fastapi-elasticsearch-opensearch.svg?color=blue)](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/network/members)
35
- [![PyPI version](https://img.shields.io/pypi/v/stac-fastapi-elasticsearch.svg?color=blue)](https://pypi.org/project/stac-fastapi-elasticsearch/)
36
- [![STAC](https://img.shields.io/badge/STAC-1.1.0-blue.svg)](https://github.com/radiantearth/stac-spec/tree/v1.1.0)
37
- [![stac-fastapi](https://img.shields.io/badge/stac--fastapi-6.0.0-blue.svg)](https://github.com/stac-utils/stac-fastapi)
38
-
39
- ## Sponsors & Supporters
40
-
41
- The following organizations have contributed time and/or funding to support the development of this project:
42
-
43
- <p align="left">
44
- <a href="https://healy-hyperspatial.github.io/"><img src="https://raw.githubusercontent.com/stac-utils/stac-fastapi-elasticsearch-opensearch/refs/heads/main/assets/hh-logo-blue.png" alt="Healy Hyperspatial" height="100" hspace="20"></a>
45
- <a href="https://atomicmaps.io/"><img src="https://raw.githubusercontent.com/stac-utils/stac-fastapi-elasticsearch-opensearch/refs/heads/main/assets/am-logo-black.png" alt="Atomic Maps" height="100" hspace="20"></a>
46
- <a href="https://remotesensing.vito.be/"><img src="https://raw.githubusercontent.com/stac-utils/stac-fastapi-elasticsearch-opensearch/refs/heads/main/assets/VITO.png" alt="VITO Remote Sensing" height="100" hspace="20"></a>
47
- </p>
48
-
49
- ## Project Introduction - What is SFEOS?
50
-
51
- SFEOS (stac-fastapi-elasticsearch-opensearch) is a high-performance, scalable API implementation for serving SpatioTemporal Asset Catalog (STAC) data - an enhanced GeoJSON format designed specifically for geospatial assets like satellite imagery, aerial photography, and other Earth observation data. This project enables organizations to:
52
-
53
- - **Efficiently catalog and search geospatial data** such as satellite imagery, aerial photography, DEMs, and other geospatial assets using Elasticsearch or OpenSearch as the database backend
54
- - **Implement standardized STAC APIs** that support complex spatial, temporal, and property-based queries across large collections of geospatial data
55
- - **Scale to millions of geospatial assets** with fast search performance through optimized spatial indexing and query capabilities
56
- - **Support OGC-compliant filtering** including spatial operations (intersects, contains, etc.) and temporal queries
57
- - **Perform geospatial aggregations** to analyze data distribution across space and time
58
- - **Search collections with enhanced capabilities**, including sorting and field selection
59
-
60
- This implementation builds on the STAC-FastAPI framework, providing a production-ready solution specifically optimized for Elasticsearch and OpenSearch databases. It's ideal for organizations managing large geospatial data catalogs who need efficient discovery and access capabilities through standardized APIs.
61
-
62
- ## Common Deployment Patterns
63
-
64
- stac-fastapi-elasticsearch-opensearch can be deployed in several ways depending on your needs:
65
-
66
- - **Containerized Application**: Run as a Docker container with connections to Elasticsearch/OpenSearch databases
67
- - **Serverless Function**: Deploy as AWS Lambda or similar serverless function with API Gateway
68
- - **Traditional Server**: Run on virtual machines or bare metal servers in your infrastructure
69
- - **Kubernetes**: Deploy as part of a larger microservices architecture with container orchestration
70
-
71
- The implementation is flexible and can scale from small local deployments to large production environments serving millions of geospatial assets.
72
-
73
- ## Technologies
74
-
75
- This project is built on the following technologies: STAC, stac-fastapi, FastAPI, Elasticsearch, Python, OpenSearch
76
-
77
- <p align="left">
78
- <a href="https://stacspec.org/"><img src="https://raw.githubusercontent.com/stac-utils/stac-fastapi-elasticsearch-opensearch/refs/heads/main/assets/STAC-01.png" alt="STAC" height="100" hspace="10"></a>
79
- <a href="https://www.python.org/"><img src="https://raw.githubusercontent.com/stac-utils/stac-fastapi-elasticsearch-opensearch/refs/heads/main/assets/python.png" alt="Python" height="80" hspace="10"></a>
80
- <a href="https://fastapi.tiangolo.com/"><img src="https://raw.githubusercontent.com/stac-utils/stac-fastapi-elasticsearch-opensearch/refs/heads/main/assets/fastapi.svg" alt="FastAPI" height="80" hspace="10"></a>
81
- <a href="https://www.elastic.co/"><img src="https://raw.githubusercontent.com/stac-utils/stac-fastapi-elasticsearch-opensearch/refs/heads/main/assets/elasticsearch.png" alt="Elasticsearch" height="80" hspace="10"></a>
82
- <a href="https://opensearch.org/"><img src="https://raw.githubusercontent.com/stac-utils/stac-fastapi-elasticsearch-opensearch/refs/heads/main/assets/opensearch.svg" alt="OpenSearch" height="80" hspace="10"></a>
83
- </p>
84
-
85
- ## Table of Contents
86
-
87
- - [stac-fastapi-elasticsearch-opensearch](#stac-fastapi-elasticsearch-opensearch)
88
- - [Sponsors & Supporters](#sponsors--supporters)
89
- - [Project Introduction - What is SFEOS?](#project-introduction---what-is-sfeos)
90
- - [Common Deployment Patterns](#common-deployment-patterns)
91
- - [Technologies](#technologies)
92
- - [Table of Contents](#table-of-contents)
93
- - [Collection Search Extensions](#collection-search-extensions)
94
- - [Documentation & Resources](#documentation--resources)
95
- - [Package Structure](#package-structure)
96
- - [Examples](#examples)
97
- - [Performance](#performance)
98
- - [Direct Response Mode](#direct-response-mode)
99
- - [Quick Start](#quick-start)
100
- - [Installation](#installation)
101
- - [Running Locally](#running-locally)
102
- - [Using Pre-built Docker Images](#using-pre-built-docker-images)
103
- - [Using Docker Compose](#using-docker-compose)
104
- - [Configuration Reference](#configuration-reference)
105
- - [Datetime-Based Index Management](#datetime-based-index-management)
106
- - [Overview](#overview)
107
- - [When to Use](#when-to-use)
108
- - [Configuration](#configuration)
109
- - [Enabling Datetime-Based Indexing](#enabling-datetime-based-indexing)
110
- - [Related Configuration Variables](#related-configuration-variables)
111
- - [How Datetime-Based Indexing Works](#how-datetime-based-indexing-works)
112
- - [Index and Alias Naming Convention](#index-and-alias-naming-convention)
113
- - [Index Size Management](#index-size-management)
114
- - [Interacting with the API](#interacting-with-the-api)
115
- - [Configure the API](#configure-the-api)
116
- - [Collection Pagination](#collection-pagination)
117
- - [Ingesting Sample Data CLI Tool](#ingesting-sample-data-cli-tool)
118
- - [Elasticsearch Mappings](#elasticsearch-mappings)
119
- - [Managing Elasticsearch Indices](#managing-elasticsearch-indices)
120
- - [Snapshots](#snapshots)
121
- - [Reindexing](#reindexing)
122
- - [Auth](#auth)
123
- - [Aggregation](#aggregation)
124
- - [Rate Limiting](#rate-limiting)
125
-
126
- ## Documentation & Resources
127
-
128
- - **Online Documentation**: [https://stac-utils.github.io/stac-fastapi-elasticsearch-opensearch](https://stac-utils.github.io/stac-fastapi-elasticsearch-opensearch/)
129
- - **Source Code**: [https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch)
130
- - **API Examples**: [Postman Documentation](https://documenter.getpostman.com/view/12888943/2s8ZDSdRHA) - Examples of how to use the API endpoints
131
- - **Community**:
132
- - [Gitter Chat](https://app.gitter.im/#/room/#stac-fastapi-elasticsearch_community:gitter.im) - For real-time discussions
133
- - [GitHub Discussions](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/discussions) - For longer-form questions and answers
134
-
135
- ## Collection Search Extensions
136
-
137
- SFEOS provides enhanced collection search capabilities through two primary routes:
138
- - **GET/POST `/collections`**: The standard STAC endpoint with extended query parameters
139
- - **GET/POST `/collections-search`**: A custom endpoint that supports the same parameters, created to avoid conflicts with the STAC Transactions extension if enabled (which uses POST `/collections` for collection creation)
140
-
141
- These endpoints support advanced collection discovery features including:
142
-
143
- - **Sorting**: Sort collections by sortable fields using the `sortby` parameter
144
- - Example: `/collections?sortby=+id` (ascending sort by ID)
145
- - Example: `/collections?sortby=-id` (descending sort by ID)
146
- - Example: `/collections?sortby=-temporal` (descending sort by temporal extent)
147
-
148
- - **Field Selection**: Request only specific fields to be returned using the `fields` parameter
149
- - Example: `/collections?fields=id,title,description`
150
- - This helps reduce payload size when only certain fields are needed
151
-
152
- - **Free Text Search**: Search across collection text fields using the `q` parameter
153
- - Example: `/collections?q=landsat`
154
- - Searches across multiple text fields including title, description, and keywords
155
- - Supports partial word matching and relevance-based sorting
156
-
157
- - **Structured Filtering**: Filter collections using CQL2 expressions
158
- - JSON format: `/collections?filter={"op":"=","args":[{"property":"id"},"sentinel-2"]}&filter-lang=cql2-json`
159
- - Text format: `/collections?filter=id='sentinel-2'&filter-lang=cql2-text` (note: string values must be quoted)
160
- - Advanced text format: `/collections?filter=id LIKE '%sentinel%'&filter-lang=cql2-text` (supports LIKE, BETWEEN, etc.)
161
- - Supports both CQL2 JSON and CQL2 text formats with various operators
162
- - Enables precise filtering on any collection property
163
-
164
- - **Datetime Filtering**: Filter collections by their temporal extent using the `datetime` parameter
165
- - Example: `/collections?datetime=2020-01-01T00:00:00Z/2020-12-31T23:59:59Z` (finds collections with temporal extents that overlap this range)
166
- - Example: `/collections?datetime=2020-06-15T12:00:00Z` (finds collections whose temporal extent includes this specific time)
167
- - Example: `/collections?datetime=2020-01-01T00:00:00Z/..` (finds collections with temporal extents that extend to or beyond January 1, 2020)
168
- - Example: `/collections?datetime=../2020-12-31T23:59:59Z` (finds collections with temporal extents that begin on or before December 31, 2020)
169
- - Collections are matched if their temporal extent overlaps with the provided datetime parameter
170
- - This allows for efficient discovery of collections based on time periods
171
-
172
- These extensions make it easier to build user interfaces that display and navigate through collections efficiently.
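
As a rough illustration of the query parameters listed above, the following sketch builds `/collections` URLs with Python's standard library. The helper name `collections_url` and the base URL are assumptions made for this example and are not part of SFEOS. One detail worth noting: a literal `+` in a query string is commonly decoded as a space, so an ascending sort such as `sortby=+id` is safest when percent-encoded as `%2Bid`, which `urlencode` does automatically.

```python
from urllib.parse import urlencode


def collections_url(base_url, **params):
    """Build a /collections query URL (hypothetical helper, not part of SFEOS).

    Keyword names use underscores; they are converted to the hyphenated
    parameter names used by the API (for example filter_lang -> filter-lang).
    Parameters set to None are dropped.
    """
    query = {
        name.replace("_", "-"): value
        for name, value in params.items()
        if value is not None
    }
    return f"{base_url}/collections?{urlencode(query)}"


# Free text search, descending sort, field selection and an open-ended datetime range.
url = collections_url(
    "http://localhost:8080",  # assumed local deployment
    q="landsat",
    sortby="-id",
    fields="id,title,description",
    datetime="2020-01-01T00:00:00Z/..",
)
print(url)
```

The same helper covers CQL2 filters, for example `collections_url(base, filter="id='sentinel-2'", filter_lang="cql2-text")`.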
173
-
174
- > **Configuration**: Collection search extensions (sorting, field selection, free text search, structured filtering, and datetime filtering) for the `/collections` endpoint can be disabled by setting the `ENABLE_COLLECTIONS_SEARCH` environment variable to `false`. By default, these extensions are enabled.
175
- >
176
- > **Configuration**: The custom `/collections-search` endpoint can be enabled by setting the `ENABLE_COLLECTIONS_SEARCH_ROUTE` environment variable to `true`. By default, this endpoint is **disabled**.
177
-
178
- > **Note**: Sorting is only available on fields that are indexed for sorting in Elasticsearch/OpenSearch. With the default mappings, you can sort on:
179
- > - `id` (keyword field)
180
- > - `extent.temporal.interval` (date field)
181
- > - `temporal` (alias to extent.temporal.interval)
182
- >
183
- > Text fields like `title` and `description` are not sortable by default as they use text analysis for better search capabilities. Attempting to sort on these fields will result in a user-friendly error message explaining which fields are sortable and how to make additional fields sortable by updating the mappings.
184
- >
185
- > **Important**: Adding keyword fields to make text fields sortable can significantly increase the index size, especially for large text fields. Consider the storage implications when deciding which fields to make sortable.
186
-
187
-
188
- ## Package Structure
189
-
190
- This project is organized into several packages, each with a specific purpose:
191
-
192
- - **stac_fastapi_core**: Core functionality that's database-agnostic, including API models, extensions, and shared utilities. This package provides the foundation for building STAC API implementations with any database backend. See [stac-fastapi-mongo](https://github.com/Healy-Hyperspatial/stac-fastapi-mongo) for a working example.
193
-
194
- - **sfeos_helpers**: Shared helper functions and utilities used by both the Elasticsearch and OpenSearch backends. This package includes:
195
- - `database`: Specialized modules for index, document, and database utility operations
196
- - `aggregation`: Elasticsearch/OpenSearch-specific aggregation functionality
197
- - Shared logic and utilities that improve code reuse between backends
198
-
199
- - **stac_fastapi_elasticsearch**: Complete implementation of the STAC API using Elasticsearch as the backend database. This package depends on both `stac_fastapi_core` and `sfeos_helpers`.
200
-
201
- - **stac_fastapi_opensearch**: Complete implementation of the STAC API using OpenSearch as the backend database. This package depends on both `stac_fastapi_core` and `sfeos_helpers`.
202
-
203
- ## Examples
204
-
205
- The `/examples` directory contains several useful examples and reference implementations:
206
-
207
- - **pip_docker**: Examples of running stac-fastapi-elasticsearch from PyPI in Docker without needing any code from the repository
208
- - **auth**: Authentication examples including:
209
- - Basic authentication
210
- - OAuth2 with Keycloak
211
- - Route dependencies configuration
212
- - **rate_limit**: Example of implementing rate limiting for API requests
213
- - **postman_collections**: Postman collection files you can import for testing API endpoints
214
-
215
- These examples provide practical reference implementations for various deployment scenarios and features.
216
-
217
- ## Performance
218
-
219
- ### Direct Response Mode
220
-
221
- - The `enable_direct_response` option is provided by the stac-fastapi core library (introduced in stac-fastapi 5.2.0) and is available in this project starting from v4.0.0.
222
- - **Control via environment variable**: Set `ENABLE_DIRECT_RESPONSE=true` to enable this feature.
223
- - **How it works**: When enabled, endpoints return Starlette Response objects directly, bypassing FastAPI's default serialization for improved performance.
224
- - **Important limitation**: All FastAPI dependencies (including authentication, custom status codes, and validation) are disabled for all routes when this mode is enabled.
225
- - **Best use case**: This mode is best suited for public or read-only APIs where authentication and custom logic are not required.
226
- - **Default setting**: `false` for safety.
227
- - **More information**: See [issue #347](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/issues/347) for background and implementation details.
228
-
229
- ## Quick Start
230
-
231
- This section helps you get up and running with stac-fastapi-elasticsearch-opensearch quickly.
232
-
233
- ### Installation
234
-
235
- - **For versions 4.0.0a1 and newer** (PEP 625 compliant naming):
236
- ```bash
237
- pip install stac-fastapi-elasticsearch # Elasticsearch backend
238
- pip install stac-fastapi-opensearch # OpenSearch backend
239
- pip install stac-fastapi-core # Core library
240
- ```
241
-
242
- - **For versions 4.0.0a0 and older**:
243
- ```bash
244
- pip install stac-fastapi.elasticsearch # Elasticsearch backend
245
- pip install stac-fastapi.opensearch # OpenSearch backend
246
- pip install stac-fastapi.core # Core library
247
- ```
248
-
249
- > **Important Note:** Starting with version 4.0.0a1, package names have changed from using periods (e.g., `stac-fastapi.core`) to using hyphens (e.g., `stac-fastapi-core`) to comply with PEP 625. The internal package structure uses underscores, but users should install with hyphens as shown above. Please update your requirements files accordingly.
250
-
251
- ### Running Locally
252
-
253
- There are two main ways to run the API locally:
254
-
255
- #### Using Pre-built Docker Images
256
-
257
- - We provide ready-to-use Docker images through GitHub Container Registry:
258
- - [Elasticsearch backend](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pkgs/container/stac-fastapi-es)
259
- - [OpenSearch backend](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pkgs/container/stac-fastapi-os)
260
-
261
- - **Pull and run the images**:
262
- ```shell
263
- # For Elasticsearch backend
264
- docker pull ghcr.io/stac-utils/stac-fastapi-es:latest
265
-
266
- # For OpenSearch backend
267
- docker pull ghcr.io/stac-utils/stac-fastapi-os:latest
268
- ```
269
-
270
- #### Using Docker Compose
271
-
272
- - **Prerequisites**: Ensure [Docker Compose](https://docs.docker.com/compose/install/) or [Podman Compose](https://podman-desktop.io/docs/compose) is installed on your machine.
273
-
274
- - **Start the API**:
275
- ```shell
276
- docker compose up elasticsearch app-elasticsearch
277
- ```
278
-
279
- - **Configuration**: By default, Docker Compose uses Elasticsearch 8.x and OpenSearch 2.11.1. To use different versions, create a `.env` file:
280
- ```shell
281
- ELASTICSEARCH_VERSION=8.11.0
282
- OPENSEARCH_VERSION=2.11.1
283
- ENABLE_DIRECT_RESPONSE=false
284
- ```
285
-
286
- - **Compatibility**: The most recent Elasticsearch 7.x versions should also work. See the [opensearch-py docs](https://github.com/opensearch-project/opensearch-py/blob/main/COMPATIBILITY.md) for compatibility information.
287
-
288
-
289
-
290
- ## Configuration Reference
291
-
292
- You can customize additional settings in your `.env` file:
293
-
294
- | Variable | Description | Default | Required |
295
- |------------------------------|--------------------------------------------------------------------------------------|--------------------------|---------------------------------------------------------------------------------------------|
296
- | `ES_HOST` | Hostname for external Elasticsearch/OpenSearch. | `localhost` | Optional |
297
- | `ES_PORT` | Port for Elasticsearch/OpenSearch. | `9200` (ES) / `9202` (OS)| Optional |
298
- | `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `true` | Optional |
299
- | `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `true` | Optional |
300
- | `ES_API_KEY` | API Key for external Elasticsearch/OpenSearch. | N/A | Optional |
301
- | `ES_TIMEOUT` | Client timeout for Elasticsearch/OpenSearch. | DB client default | Optional |
302
- | `STAC_FASTAPI_TITLE` | Title of the API in the documentation. | `stac-fastapi-<backend>` | Optional |
303
- | `STAC_FASTAPI_DESCRIPTION` | Description of the API in the documentation. | N/A | Optional |
304
- | `STAC_FASTAPI_VERSION` | API version. | `2.1` | Optional |
305
- | `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID | `stac-fastapi` | Optional |
306
- | `APP_HOST` | Server bind address. | `0.0.0.0` | Optional |
307
- | `APP_PORT` | Server port. | `8000` | Optional |
308
- | `ENVIRONMENT` | Runtime environment. | `local` | Optional |
309
- | `WEB_CONCURRENCY` | Number of worker processes. | `10` | Optional |
310
- | `RELOAD` | Enable auto-reload for development. | `true` | Optional |
311
- | `STAC_FASTAPI_RATE_LIMIT` | API rate limit per client. | `200/minute` | Optional |
312
- | `BACKEND` | Tests-related variable | `elasticsearch` or `opensearch` based on the backend | Optional |
313
- | `ELASTICSEARCH_VERSION` | Version of Elasticsearch to use. | `8.11.0` | Optional |
314
- | `OPENSEARCH_VERSION` | OpenSearch version | `2.11.1` | Optional |
315
- | `ENABLE_DIRECT_RESPONSE` | Enable direct response for maximum performance (disables all FastAPI dependencies, including authentication, custom status codes, and validation) | `false` | Optional |
316
- | `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
317
- | `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
318
- | `ENABLE_COLLECTIONS_SEARCH` | Enable collection search extensions (sort, fields, free text search, structured filtering, and datetime filtering) on the core `/collections` endpoint. | `true` | Optional |
319
- | `ENABLE_COLLECTIONS_SEARCH_ROUTE` | Enable the custom `/collections-search` endpoint (both GET and POST methods). When disabled, the custom endpoint will not be available, but collection search extensions will still be available on the core `/collections` endpoint if `ENABLE_COLLECTIONS_SEARCH` is true. | `false` | Optional |
320
- | `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. This is useful for deployments where mutating the catalog via the API should be prevented. If set to `true`, the POST `/collections` route for search will be unavailable in the API. | `true` | Optional |
321
- | `STAC_ITEM_LIMIT` | Sets the limit on the number of items and STAC collections that SFEOS returns. | `10` | Optional |
322
- | `STAC_INDEX_ASSETS` | Controls whether Assets are indexed when added to Elasticsearch/OpenSearch. This allows asset fields to be included in search queries. | `false` | Optional |
323
- | `ENV_MAX_LIMIT` | Overrides the default `MAX_LIMIT`, which caps the `limit` parameter for returned items and STAC collections. | `10,000` | Optional |
324
- | `USE_DATETIME` | Configures the datetime search behavior in SFEOS. When enabled, searches the `datetime` field and falls back to the `start_datetime`/`end_datetime` range for items with a null `datetime`. When disabled, searches only by the `start_datetime`/`end_datetime` range. | `true` | Optional |
325
-
326
- > [!NOTE]
327
- > The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
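
To make the table above more concrete, here is a minimal sketch of reading a few of these variables with the defaults listed there. The functions `parse_bool`, `parse_refresh`, and `load_settings` are hypothetical names for this example; SFEOS has its own settings handling, which may differ. The sketch mainly shows that `DATABASE_REFRESH` is three-valued (`true`, `false`, or `wait_for`), so it cannot be parsed as a plain boolean.

```python
def parse_bool(value):
    """Treat the string 'true' (any case) as True, anything else as False."""
    return value.strip().lower() == "true"


def parse_refresh(value):
    """Parse DATABASE_REFRESH, which accepts true, false, or wait_for."""
    normalized = value.strip().lower()
    if normalized == "wait_for":
        return "wait_for"
    if normalized in ("true", "false"):
        return normalized == "true"
    raise ValueError(f"unsupported DATABASE_REFRESH value: {value!r}")


def load_settings(env):
    """Read a few settings from a mapping such as os.environ.

    Defaults follow the configuration table (Elasticsearch port shown).
    """
    return {
        "host": env.get("ES_HOST", "localhost"),
        "port": int(env.get("ES_PORT", "9200")),
        "use_ssl": parse_bool(env.get("ES_USE_SSL", "true")),
        "verify_certs": parse_bool(env.get("ES_VERIFY_CERTS", "true")),
        "refresh": parse_refresh(env.get("DATABASE_REFRESH", "false")),
    }


# Example: a local setup without TLS that waits for the next refresh cycle.
settings = load_settings({"ES_USE_SSL": "false", "DATABASE_REFRESH": "wait_for"})
print(settings)
```

In a real process the mapping would typically be `os.environ`; passing a plain dictionary keeps the example easy to test.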
328
-
329
- ## Datetime-Based Index Management
330
-
331
- ### Overview
332
-
333
- SFEOS supports two indexing strategies for managing STAC items:
334
-
335
- 1. **Simple Indexing** (default) - One index per collection
336
- 2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
337
-
338
- The datetime-based indexing strategy is particularly useful for large temporal datasets. When a query includes a datetime parameter, the system can determine which indexes cover that time range and search only those, which makes searches **several times faster** and significantly **reduces database load**.
339
-
340
- ### When to Use
341
-
342
- **Recommended for:**
343
- - Systems with large collections containing millions of items
344
- - Systems requiring high-performance temporal searching
345
-
346
- **Pros:**
347
- - Queries with a datetime filter run several times faster
348
- - Reduced database load - only relevant indexes are searched
349
-
350
- **Cons:**
351
- - Slightly longer item indexing time (automatic index management)
352
- - Greater management complexity
353
-
354
- ### Configuration
355
-
356
- #### Enabling Datetime-Based Indexing
357
-
358
- Enable datetime-based indexing by setting the following environment variable:
359
-
360
- ```bash
361
- ENABLE_DATETIME_INDEX_FILTERING=true
362
- ```
363
-
364
- #### Related Configuration Variables
365
-
366
- | Variable | Description | Default | Example |
367
- |----------|-------------|---------|---------|
368
- | `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
369
- | `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
370
- | `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
371
-
372
- ## How Datetime-Based Indexing Works
373
-
374
- ### Index and Alias Naming Convention
375
-
376
- The system uses a precise naming convention:
377
-
378
- **Physical indexes:**
379
- ```
380
- {ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
381
- ```
382
-
383
- **Aliases:**
384
- ```
385
- {ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
386
- {ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
387
- {ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
388
- ```
389
-
390
- **Example:**
391
-
392
- *Physical indexes:*
393
- - `items_sentinel-2-l2a_a1b2c3d4-e5f6-7890-abcd-ef1234567890`
394
-
395
- *Aliases:*
396
- - `items_sentinel-2-l2a` - main collection alias
397
- - `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
398
- - `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
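
The naming convention above can be sketched in a few lines of Python. The function names `physical_index_name` and `alias_names` are illustrative assumptions, not SFEOS functions, and the sketch does not attempt any sanitizing of collection IDs that the real implementation may perform.

```python
import uuid


def physical_index_name(prefix, collection_id):
    """Return a physical index name: {prefix}{collection-id}_{uuid4}."""
    return f"{prefix}{collection_id}_{uuid.uuid4()}"


def alias_names(prefix, collection_id, start, end=None):
    """Return (main alias, temporal alias) for a collection.

    With only `start`, the temporal alias describes an active index.
    With `end` as well, it describes a closed index.
    """
    main = f"{prefix}{collection_id}"
    temporal = f"{main}_{start}"
    if end is not None:
        temporal = f"{temporal}_{end}"
    return main, temporal


# Reproduces the aliases from the example above.
print(alias_names("items_", "sentinel-2-l2a", "2024-01-01"))
print(alias_names("items_", "sentinel-2-l2a", "2024-01-01", "2024-03-15"))
print(physical_index_name("items_", "sentinel-2-l2a"))
```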
399
-
400
- ### Index Size Management
401
-
402
- **Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. It is recommended to add +20% to the target size to account for compression overhead and metadata.
403
-
404
- ## Interacting with the API
405
-
406
- - **Creating a Collection**:
407
- ```shell
408
- curl -X "POST" "http://localhost:8080/collections" \
409
- -H 'Content-Type: application/json; charset=utf-8' \
410
- -d $'{
411
- "id": "my_collection"
412
- }'
413
- ```
414
-
415
- - **Adding an Item to a Collection**:
416
- ```shell
417
- curl -X "POST" "http://localhost:8080/collections/my_collection/items" \
418
- -H 'Content-Type: application/json; charset=utf-8' \
419
- -d @item.json
420
- ```
421
-
422
- - **Searching for Items**:
423
- ```shell
424
- curl -X "POST" "http://localhost:8080/search" \
425
- -H 'Content-Type: application/json; charset=utf-8' \
426
- -d $'{
427
- "collections": ["my_collection"],
428
- "limit": 10
429
- }'
430
- ```
431
-
432
- - **Filtering by Bbox**:
433
- ```shell
434
- curl -X "POST" "http://localhost:8080/search" \
435
- -H 'Content-Type: application/json; charset=utf-8' \
436
- -d $'{
437
- "collections": ["my_collection"],
438
- "bbox": [-180, -90, 180, 90]
439
- }'
440
- ```
441
-
442
- - **Filtering by Datetime**:
443
- ```shell
444
- curl -X "POST" "http://localhost:8080/search" \
445
- -H 'Content-Type: application/json; charset=utf-8' \
446
- -d $'{
447
- "collections": ["my_collection"],
448
- "datetime": "2020-01-01T00:00:00Z/2020-12-31T23:59:59Z"
449
- }'
450
- ```
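
For scripted access, a search with a JSON body can also be prepared from Python. In the STAC API, JSON search bodies are sent with `POST /search`. The sketch below only constructs the request using the standard library; the helper name `build_search_request` is an assumption for this example, and the base URL mirrors the curl commands above.

```python
import json
from urllib.request import Request


def build_search_request(base_url, body):
    """Prepare a POST /search request carrying a JSON search body."""
    return Request(
        f"{base_url}/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


request = build_search_request(
    "http://localhost:8080",  # assumed local deployment
    {
        "collections": ["my_collection"],
        "bbox": [-180, -90, 180, 90],
        "datetime": "2020-01-01T00:00:00Z/2020-12-31T23:59:59Z",
        "limit": 10,
    },
)
print(request.get_method(), request.full_url)
```

To send it against a running API, pass the object to `urllib.request.urlopen(request)` and decode the JSON response.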
451
-
452
- ## Configure the API
453
-
454
- - **API Title and Description**: The API title defaults to `stac-fastapi-<backend>`. Customize the title and description by setting:
455
- - `STAC_FASTAPI_TITLE`: Changes the API title in the documentation
456
- - `STAC_FASTAPI_DESCRIPTION`: Changes the API description in the documentation
457
-
458
- - **Database Indices**: By default, the API reads from and writes to:
459
- - `collections` index for collections
460
- - `items_<collection name>` indices for items
461
- - Customize with `STAC_COLLECTIONS_INDEX` and `STAC_ITEMS_INDEX_PREFIX` environment variables
462
-
463
- - **Root Path Configuration**: The application root path is the base URL by default.
464
- - For AWS Lambda with API Gateway: Set `STAC_FASTAPI_ROOT_PATH` to match the API Gateway stage name (e.g., `/v1`)
465
-
466
- - **Feature Configuration**: Control which features are enabled:
467
- - `ENABLE_COLLECTIONS_SEARCH`: Set to `true` (default) to enable collection search extensions (sort, fields, free text search, structured filtering, and datetime filtering). Set to `false` to disable.
468
- - `ENABLE_TRANSACTIONS_EXTENSIONS`: Set to `true` (default) to enable transaction extensions. Set to `false` to disable.
469
-
470
- ## Collection Pagination
471
-
472
- - **Overview**: The collections route supports pagination through optional query parameters.
473
- - **Parameters**:
474
- - `limit`: Controls the number of collections returned per page
475
- - `token`: Used to retrieve subsequent pages of results
476
- - **Response Structure**: The `links` field in the response contains a `next` link with the token for the next page of results.
477
- - **Example Usage**:
478
- ```shell
479
- curl -X "GET" "http://localhost:8080/collections?limit=1&token=example_token"
480
- ```
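
A client can follow pages by reading the token out of the `next` link. The sketch below assumes the token is carried as a `token` query parameter in the link's `href`, as in the curl example above; the helper name `next_token` and the sample response are made up for illustration.

```python
from urllib.parse import parse_qs, urlparse


def next_token(response):
    """Return the pagination token from the `next` link, or None on the last page."""
    for link in response.get("links", []):
        if link.get("rel") == "next":
            tokens = parse_qs(urlparse(link["href"]).query).get("token")
            return tokens[0] if tokens else None
    return None


# Hypothetical response fragment shaped like a /collections page.
sample_page = {
    "collections": [{"id": "my_collection"}],
    "links": [
        {"rel": "self", "href": "http://localhost:8080/collections?limit=1"},
        {"rel": "next", "href": "http://localhost:8080/collections?limit=1&token=example_token"},
    ],
}
print(next_token(sample_page))
```

Looping until `next_token` returns `None` walks through every page of collections.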
481
-
482
- ## Ingesting Sample Data CLI Tool
483
-
484
- - **Overview**: The `data_loader.py` script provides a convenient way to load STAC items into the database.
485
-
486
- - **Usage**:
487
- ```shell
488
- python3 data_loader.py --base-url http://localhost:8080
489
- ```
490
-
491
- - **Options**:
492
- ```
493
- --base-url TEXT Base URL of the STAC API [required]
494
- --collection-id TEXT ID of the collection to which items are added
495
- --use-bulk Use bulk insert method for items
496
- --data-dir PATH Directory containing collection.json and feature
497
- collection file
498
- --help Show this message and exit.
499
- ```
500
-
501
- - **Example Workflows**:
502
- - **Loading Sample Data**:
503
- ```shell
504
- python3 data_loader.py --base-url http://localhost:8080
505
- ```
506
- - **Loading Data to a Specific Collection**:
507
- ```shell
508
- python3 data_loader.py --base-url http://localhost:8080 --collection-id my-collection
509
- ```
510
- - **Using Bulk Insert for Performance**:
511
- ```shell
512
- python3 data_loader.py --base-url http://localhost:8080 --use-bulk
513
- ```
514
-
515
- ## Elasticsearch Mappings
516
-
517
- - **Overview**: Mappings apply to the search index, not the source data. They define how documents and their fields are stored and indexed.
518
- - **Implementation**:
519
- - Mappings are stored in index templates that are created on application startup
520
- - These templates are automatically applied when creating new Collection and Item indices
521
- - The `sfeos_helpers` package contains shared mapping definitions used by both Elasticsearch and OpenSearch backends
522
- - **Customization**: Custom mappings can be defined by extending the base mapping templates.
523
-
524
- ## Managing Elasticsearch Indices
525
-
526
- ### Snapshots
527
-
528
- - **Overview**: Snapshots provide a way to back up and restore your indices.
529
-
530
- - **Creating a Snapshot Repository**:
531
- ```shell
532
- curl -X "PUT" "http://localhost:9200/_snapshot/my_fs_backup" \
533
- -H 'Content-Type: application/json; charset=utf-8' \
534
- -d $'{
535
- "type": "fs",
536
- "settings": {
537
- "location": "/usr/share/elasticsearch/snapshots/my_fs_backup"
538
- }
539
- }'
540
- ```
541
- - This creates a snapshot repository that stores files in the `elasticsearch/snapshots` directory of your local clone of this git repository
542
- - The elasticsearch.yml and compose files create a mapping from that directory to /usr/share/elasticsearch/snapshots within the Elasticsearch container and grant permissions for using it
543
-
544
- - **Creating a Snapshot**:
545
- ```shell
546
- curl -X "PUT" "http://localhost:9200/_snapshot/my_fs_backup/my_snapshot_2?wait_for_completion=true" \
547
- -H 'Content-Type: application/json; charset=utf-8' \
548
- -d $'{
549
- "metadata": {
550
- "taken_because": "dump of all items",
551
- "taken_by": "pvarner"
552
- },
553
- "include_global_state": false,
554
- "ignore_unavailable": false,
555
- "indices": "items_my-collection"
556
- }'
557
- ```
558
- - This creates a snapshot named my_snapshot_2 and waits for the action to be completed before returning
559
- - This can also be done asynchronously by omitting the wait_for_completion parameter, and queried for status later
560
- - The indices parameter determines which indices are snapshotted, and can include wildcards
561
-
562
- - **Viewing Snapshots**:
563
- ```shell
564
- # View a specific snapshot
565
- curl http://localhost:9200/_snapshot/my_fs_backup/my_snapshot_2
566
-
567
- # View all snapshots
568
- curl http://localhost:9200/_snapshot/my_fs_backup/_all
569
- ```
570
- - These commands allow you to check the status and details of your snapshots
571
-
572
- - **Restoring a Snapshot**:
573
- ```shell
574
- curl -X "POST" "http://localhost:9200/_snapshot/my_fs_backup/my_snapshot_2/_restore?wait_for_completion=true" \
575
- -H 'Content-Type: application/json; charset=utf-8' \
576
- -d $'{
577
- "include_aliases": false,
578
- "include_global_state": false,
579
- "ignore_unavailable": true,
580
- "rename_replacement": "items_$1-copy",
581
- "indices": "items_*",
582
- "rename_pattern": "items_(.+)"
583
- }'
584
- ```
585
- - This command restores every index that matches `items_*` and renames each one by appending `-copy` to its name
586
- - The rename_pattern and rename_replacement parameters allow you to restore indices under new names
587
-
588
- - **Updating Collection References**:
589
- ```shell
590
- curl -X "POST" "http://localhost:9200/items_my-collection-copy/_update_by_query" \
591
- -H 'Content-Type: application/json; charset=utf-8' \
592
- -d $'{
593
- "query": {
594
- "match_all": {}
595
- },
596
- "script": {
597
- "lang": "painless",
598
- "params": {
599
- "collection": "my-collection-copy"
600
- },
601
- "source": "ctx._source.collection = params.collection"
602
- }
603
- }'
604
- ```
605
- - After restoring, the item documents are in the new index (e.g., `items_my-collection-copy`), but the `collection` field in those documents still holds the original value, `my-collection`
606
- - This command updates these values to match the new collection name using Elasticsearch's Update By Query feature
607
-
608
- - **Creating a New Collection**:
609
- ```shell
610
- curl -X "POST" "http://localhost:8080/collections" \
611
- -H 'Content-Type: application/json' \
612
- -d $'{
613
- "id": "my-collection-copy"
614
- }'
615
- ```
616
- - The final step is to create a new collection through the API with the new name for each of the restored indices
617
- - This gives you a copy of the collection that has a resource URI (/collections/my-collection-copy) and can be correctly queried by collection name
618
-
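The renaming in the restore step can be previewed locally before running it against a cluster. The sketch below applies the same `rename_pattern` and `rename_replacement` with Python's `re` module. It is an approximation: Elasticsearch writes the capture group as `$1`, while Python uses `\1`. The function name `restored_name` is hypothetical.

```python
import re


def restored_name(
    index: str,
    pattern: str = r"items_(.+)",
    replacement: str = r"items_\1-copy",
) -> str:
    """Preview the index name a restore would produce for `index`."""
    # Python spells the first capture group \1; Elasticsearch spells it $1.
    return re.sub(pattern, replacement, index)


print(restored_name("items_my-collection"))  # items_my-collection-copy
```

An index name that does not match the pattern is returned unchanged.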
619
- ### Reindexing
620
-
621
- - **Overview**: Reindexing allows you to copy documents from one index to another, optionally transforming them in the process.
622
-
623
- - **Use Cases**:
624
- - Apply changes to documents
625
- - Correct dynamically generated mappings
626
- - Transform data (e.g., lowercase identifiers)
627
- - The index templates ensure that manually created indices also get the correct mappings and settings
628
-
629
- - **Example: Reindexing with Transformation**:
630
- ```shell
631
- curl -X "POST" "http://localhost:9200/_reindex" \
632
- -H 'Content-Type: application/json' \
633
- -d $'{
634
- "source": {
635
- "index": "items_my-collection-lower_my-collection-hex-000001"
636
- },
637
- "dest": {
638
- "index": "items_my-collection-lower_my-collection-hex-000002"
639
- },
640
- "script": {
641
- "source": "ctx._source.id = ctx._source.id.toLowerCase()",
642
- "lang": "painless"
643
- }
644
- }'
645
- ```
646
- - In this example, we make a copy of an existing Item index but change the Item identifier to be lowercase
647
- - The script parameter allows you to transform documents during the reindexing process
648
-
649
- - **Updating Aliases**:
650
- ```shell
651
- curl -X "POST" "http://localhost:9200/_aliases" \
652
- -H 'Content-Type: application/json' \
653
- -d $'{
654
- "actions": [
655
- {
656
- "remove": {
657
- "index": "*",
658
- "alias": "items_my-collection"
659
- }
660
- },
661
- {
662
- "add": {
663
- "index": "items_my-collection-lower_my-collection-hex-000002",
664
- "alias": "items_my-collection"
665
- }
666
- }
667
- ]
668
- }'
669
- ```
670
- - If you are happy with the data in the newly created index, you can move the alias items_my-collection to the new index
671
- - This makes the modified Items with lowercase identifiers visible to users accessing my-collection in the STAC API
672
- - Using aliases allows you to switch between different index versions without changing the API endpoint
673
-
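When several collections need the same alias switch, the request body can be generated rather than hand-edited. This is a sketch only: the helper `alias_swap_actions` is hypothetical, and it simply reproduces the JSON structure from the curl example above.

```python
import json


def alias_swap_actions(alias: str, new_index: str) -> dict:
    """Build an `_aliases` request body that moves `alias` onto `new_index`."""
    return {
        "actions": [
            # Detach the alias from whichever index currently holds it.
            {"remove": {"index": "*", "alias": alias}},
            # Attach the alias to the newly reindexed copy.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = alias_swap_actions(
    "items_my-collection",
    "items_my-collection-lower_my-collection-hex-000002",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the `-d` payload of the `POST /_aliases` request.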
674
- ## Auth
675
-
676
- - **Overview**: Authentication is an optional feature that can be enabled through Route Dependencies.
677
- - **Implementation Options**:
678
- - Basic authentication
679
- - OAuth2 with Keycloak
680
- - Custom route dependencies
681
- - **Configuration**: Authentication can be configured using the `STAC_FASTAPI_ROUTE_DEPENDENCIES` environment variable.
682
- - **Examples and Documentation**: Detailed examples and implementation guides can be found in the [examples/auth](examples/auth) directory.
683
-
684
- ## Aggregation
685
-
686
- - **Supported Aggregations**:
687
- - Spatial aggregations of points and geometries
688
- - Frequency distribution aggregation of any property including dates
689
- - Temporal distribution of datetime values
690
-
691
- - **Endpoint Locations**:
692
- - Root Catalog level: `/aggregations`
693
- - Collection level: `/<collection_id>/aggregations`
694
-
695
- - **Implementation Details**: The `sfeos_helpers.aggregation` package provides specialized functionality for both Elasticsearch and OpenSearch backends.
696
-
697
- - **Documentation**: Detailed information about supported aggregations can be found in [the aggregation docs](./docs/src/aggregation.md).
698
-
699
-
700
- ## Rate Limiting
701
-
702
- - **Overview**: Rate limiting is an optional security feature that limits how often each remote address can call the API.
703
-
704
- - **Configuration**: Enabled by setting the `STAC_FASTAPI_RATE_LIMIT` environment variable:
705
- ```
706
- STAC_FASTAPI_RATE_LIMIT=500/minute
707
- ```
708
-
709
- - **Functionality**:
710
- - Limits each client to a specified number of requests per time period (e.g., 500 requests per minute)
711
- - Helps prevent API abuse and maintains system stability
712
- - Ensures fair resource allocation among all clients
713
-
714
- - **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
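To illustrate the `<count>/<period>` shape of the `STAC_FASTAPI_RATE_LIMIT` value, here is a small sketch that splits it into its two parts. It is not how the application itself parses the setting; `parse_rate_limit` is a hypothetical helper for validating a value in deployment scripts.

```python
from typing import Tuple


def parse_rate_limit(value: str) -> Tuple[int, str]:
    """Split a value such as '500/minute' into (500, 'minute')."""
    count, _, period = value.partition("/")
    if not period.strip():
        raise ValueError(f"expected '<count>/<period>', got {value!r}")
    # int() raises ValueError if the count is not a whole number.
    return int(count), period.strip()


print(parse_rate_limit("500/minute"))  # (500, 'minute')
```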