CHANGELOG.md: 53 additions & 1 deletion
@@ -8,6 +8,56 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 ## [Unreleased]

+### Added
+
+### Changed
+
+- Changed assets serialization to prevent mapping explosion while allowing asset information to be indexed. [#341](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/341)
+
+### Fixed
+
+## [v6.2.1] - 2025-09-02
+
+### Added
+
+- Added `id` field as secondary sort to sort config to ensure unique pagination tokens. [#421](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/421)
+- Added `STAC_ITEM_LIMIT` environment variable to SFEOS to limit the number of returned items and STAC collections [#419](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/419)
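To illustrate why the `id` secondary sort from #421 matters: a pagination token built from sort values cannot tell apart two items that share the same primary sort value, so a page boundary falling between them can repeat or skip an item. The sketch below appends `id` as a final tie-breaker. The function name `with_id_tiebreaker` and the list-of-dicts sort shape are assumptions made for illustration, not the project's actual code.

```python
from typing import Dict, List


def with_id_tiebreaker(sort: List[Dict[str, dict]]) -> List[Dict[str, dict]]:
    """Return a copy of `sort` with `id` appended as the last sort key.

    Hypothetical helper; the shape of the sort config is an assumption.
    """
    if any("id" in clause for clause in sort):
        # The caller already sorts by id, so the ordering is already total.
        return list(sort)
    return list(sort) + [{"id": {"order": "asc"}}]


sort_config = with_id_tiebreaker([{"properties.datetime": {"order": "desc"}}])
print(sort_config)
```

With the tie-breaker in place, every item has a distinct position in the ordering, so a token derived from the last item's sort values identifies exactly one place to resume from.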
+
+### Changed
+
+- Simplified Patch class and updated patch script creation including adding nest creation for merge patch [#420](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/420)
+
+## [v6.2.0] - 2025-08-27
+
+### Added
+
+- Added comprehensive index management system with dynamic selection and insertion strategies for improved performance and scalability [#405](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/405)
+- Added `ENABLE_DATETIME_INDEX_FILTERING` environment variable to enable datetime-based index selection using collection IDs. When enabled, the system creates indexes with UUID-based names and manages them through time-based aliases. Default is `false`. [#405](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/405)
+- Added `DATETIME_INDEX_MAX_SIZE_GB` environment variable to set maximum size limit in GB for datetime-based indexes. When an index exceeds this size, a new time-partitioned index will be created. Note: add +20% to target size due to ES/OS compression. Default is `25` GB. Only applies when `ENABLE_DATETIME_INDEX_FILTERING` is enabled. [#405](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/405)
+- Added index operations system with unified interface for both Elasticsearch and OpenSearch [#405](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/405):
+  - `IndexOperations` class with common index creation and management methods
+  - UUID-based physical index naming: `{prefix}_{collection-id}_{uuid4}`
+  - Alias management: main collection alias, temporal aliases, and closed index aliases
+  - Automatic alias updates when indexes reach size limits
+- Added datetime-based index selection strategies with caching support [#405](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/405):
+  - `DatetimeBasedIndexSelector` for temporal filtering with intelligent caching
+  - `IndexCacheManager` with configurable TTL-based cache expiration (default 1 hour)
+  - `IndexAliasLoader` for alias management and cache refresh
+  - `UnfilteredIndexSelector` as fallback for returning all available indexes
+- Added index insertion strategies with automatic partitioning [#405](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/405):
+  - Simple insertion strategy (`SimpleIndexInserter`) for traditional single-index-per-collection approach
+  - Datetime-based insertion strategy (`DatetimeIndexInserter`) with time-based partitioning
+  - Automatic index size monitoring and splitting when limits exceeded
+  - Handling of chronologically early data and bulk operations
+- Added index management utilities [#405](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/405):
+  - `IndexSizeManager` for size monitoring and overflow handling with compression awareness
+  - `DatetimeIndexManager` for datetime-based index operations and validation
+  - Factory patterns (`IndexInsertionFactory`, `IndexSelectorFactory`) for strategy creation based on configuration
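The TTL-based cache expiration listed for `IndexCacheManager` above can be pictured with a small stand-alone cache. This is a generic sketch of the technique, not the class added in #405: `TTLCache`, its methods, and the injectable `clock` parameter are hypothetical names, and only the one-hour default comes from the changelog entry.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Single-value cache whose content expires after `ttl_seconds`.

    Illustrative only; not the project's IndexCacheManager.
    """

    def __init__(
        self,
        ttl_seconds: float = 3600.0,  # one hour, matching the documented default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[Any] = None
        self._stored_at: Optional[float] = None

    def set(self, value: Any) -> None:
        """Store a value and remember when it was stored."""
        self._value = value
        self._stored_at = self._clock()

    def get(self) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        if self._stored_at is None:
            return None
        if self._clock() - self._stored_at >= self._ttl:
            # Expired: drop the value so the caller reloads it.
            self._value, self._stored_at = None, None
            return None
        return self._value


# A fake clock makes the expiry visible without waiting an hour.
now = [0.0]
cache = TTLCache(ttl_seconds=3600.0, clock=lambda: now[0])
cache.set(["items_demo_2024-01-01"])
print(cache.get())  # still fresh
now[0] = 3600.0     # one hour later
print(cache.get())  # expired
```

A caller would treat a `None` result as a signal to reload the alias list from the database and call `set` again.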
+
+### Changed
+
+- Added the Datetime-Based Index Management section to the Table of Contents in the README, updating heading sizes to match the rest of the document [#418](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/418)
+
 ## [v6.1.0] - 2025-07-24

 ### Added
@@ -442,7 +492,9 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 - Use genexp in execute_search and get_all_collections to return results.
 - Added db_to_stac serializer to item_collection method in core.py.
README.md: 77 additions & 1 deletion
@@ -85,6 +85,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Auth](#auth)
 - [Aggregation](#aggregation)
 - [Rate Limiting](#rate-limiting)
+- [Datetime-Based Index Management](#datetime-based-index-management)

 ## Documentation & Resources

@@ -226,10 +227,86 @@ You can customize additional settings in your `.env` file:
 | `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
 | `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
 | `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
+| `STAC_ITEM_LIMIT` | Sets the limit on the number of items and STAC collections returned by SFEOS. | `10` | Optional |

 > [!NOTE]
 > The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.

+## Datetime-Based Index Management
+
+### Overview
+
+SFEOS supports two indexing strategies for managing STAC items:
+
+1. **Simple Indexing** (default) - One index per collection
+2. **Datetime-Based Indexing** - Time-partitioned indexes with automatic management
+
+The datetime-based indexing strategy is particularly useful for large temporal datasets. When a query includes a datetime parameter, the system searches only the indexes whose time range matches it, which can make searches **several times faster** and significantly **reduce database load**.
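The index selection described above amounts to an interval-overlap test between the query range and each partition's time range. The following is a minimal sketch under stated assumptions: `Partition`, `select_partitions`, the `items_demo_*` alias names, and the inclusive end dates are all hypothetical and are not taken from the SFEOS implementation.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Partition(NamedTuple):
    alias: str
    start: date
    end: Optional[date]  # None means the partition is still open (active)


def select_partitions(
    partitions: List[Partition], query_start: date, query_end: date
) -> List[str]:
    """Return the aliases whose time range overlaps [query_start, query_end]."""
    selected = []
    for p in partitions:
        starts_before_query_ends = p.start <= query_end
        ends_after_query_starts = p.end is None or p.end >= query_start
        if starts_before_query_ends and ends_after_query_starts:
            selected.append(p.alias)
    return selected


partitions = [
    Partition("items_demo_2023-01-01_2023-12-31", date(2023, 1, 1), date(2023, 12, 31)),
    Partition("items_demo_2024-01-01_2024-03-15", date(2024, 1, 1), date(2024, 3, 15)),
    Partition("items_demo_2024-03-16", date(2024, 3, 16), None),
]

# A query for February 2024 touches only one of the three partitions.
print(select_partitions(partitions, date(2024, 2, 1), date(2024, 2, 28)))
# ['items_demo_2024-01-01_2024-03-15']
```

A query without a datetime filter cannot be narrowed this way and would have to search every partition of the collection.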
+
+### When to Use
+
+**Recommended for:**
+- Systems with large collections containing millions of items
+- Systems requiring high-performance temporal searching
+
+**Pros:**
+- Queries with a datetime filter can be several times faster
+- Reduced database load - only relevant indexes are searched
+
+**Cons:**
+- Slightly longer item indexing time (automatic index management)
+- Greater management complexity
+
+### Configuration
+
+#### Enabling Datetime-Based Indexing
+
+Enable datetime-based indexing by setting the following environment variable:
+
+```bash
+ENABLE_DATETIME_INDEX_FILTERING=true
+```
+
+### Related Configuration Variables
+
+| Variable | Description | Default | Example |
+|----------|-------------|---------|---------|
+| `ENABLE_DATETIME_INDEX_FILTERING` | Enables time-based index partitioning | `false` | `true` |
+| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit for datetime indexes (GB) - note: add +20% to target size due to ES/OS compression | `25` | `50` |
+| `STAC_ITEMS_INDEX_PREFIX` | Prefix for item indexes | `items_` | `stac_items_` |
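As a rough illustration of how an application might read the three variables in the table, the sketch below parses them with the documented defaults. `DatetimeIndexSettings` and `load_settings` are hypothetical names, and the rule that only a case-insensitive `true` enables the feature is an assumption rather than confirmed SFEOS behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DatetimeIndexSettings:
    enabled: bool
    max_size_gb: float
    items_index_prefix: str


def load_settings(env: Mapping[str, str] = os.environ) -> DatetimeIndexSettings:
    """Read the datetime-index variables, falling back to the documented defaults."""
    raw_enabled = env.get("ENABLE_DATETIME_INDEX_FILTERING", "false")
    return DatetimeIndexSettings(
        # Assumption: any value other than "true" (case-insensitive) disables it.
        enabled=raw_enabled.strip().lower() == "true",
        max_size_gb=float(env.get("DATETIME_INDEX_MAX_SIZE_GB", "25")),
        items_index_prefix=env.get("STAC_ITEMS_INDEX_PREFIX", "items_"),
    )


# Passing a plain dict instead of os.environ keeps the example reproducible.
print(load_settings({"ENABLE_DATETIME_INDEX_FILTERING": "true",
                     "DATETIME_INDEX_MAX_SIZE_GB": "50"}))
```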
+
+## How Datetime-Based Indexing Works
+
+### Index and Alias Naming Convention
+
+The system uses a precise naming convention:
+
+**Physical indexes:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id}_{uuid4}
+```
+
+**Aliases:**
+```
+{ITEMS_INDEX_PREFIX}{collection-id} # Main collection alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime} # Temporal alias
+{ITEMS_INDEX_PREFIX}{collection-id}_{start-datetime}_{end-datetime} # Closed index alias
+```
+- `items_sentinel-2-l2a_2024-01-01` - active alias from January 1, 2024
+- `items_sentinel-2-l2a_2024-01-01_2024-03-15` - closed index alias (reached size limit)
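The naming convention above can be sketched in Python. This is an illustration only: `physical_index_name` and `alias_name` are hypothetical helpers, the `items_` prefix is the documented default, and the `YYYY-MM-DD` date format and hyphenated UUID string are assumptions inferred from the examples rather than confirmed details of the implementation.

```python
import uuid
from datetime import date
from typing import Optional

ITEMS_INDEX_PREFIX = "items_"  # documented default of STAC_ITEMS_INDEX_PREFIX


def physical_index_name(collection_id: str) -> str:
    """Build a unique physical index name: {prefix}{collection-id}_{uuid4}."""
    return f"{ITEMS_INDEX_PREFIX}{collection_id}_{uuid.uuid4()}"


def alias_name(
    collection_id: str,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> str:
    """Build the main, temporal, or closed-index alias for a collection."""
    name = f"{ITEMS_INDEX_PREFIX}{collection_id}"
    if start is None:
        if end is not None:
            raise ValueError("an end date requires a start date")
        return name  # main collection alias
    name = f"{name}_{start.isoformat()}"  # temporal (active) alias
    if end is not None:
        name = f"{name}_{end.isoformat()}"  # closed index alias
    return name


print(alias_name("sentinel-2-l2a"))
print(alias_name("sentinel-2-l2a", date(2024, 1, 1)))
print(alias_name("sentinel-2-l2a", date(2024, 1, 1), date(2024, 3, 15)))
```

The last two calls reproduce the `items_sentinel-2-l2a_2024-01-01` and `items_sentinel-2-l2a_2024-01-01_2024-03-15` examples listed above.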
+
+### Index Size Management
+
+**Important - Data Compression:** Elasticsearch and OpenSearch automatically compress data. The configured `DATETIME_INDEX_MAX_SIZE_GB` limit refers to the compressed size on disk. To account for compression overhead and metadata, set the limit about 20% above the target size.
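As a worked example of the 20% recommendation, the arithmetic can be written out as follows. Both function names are hypothetical, and the strict greater-than comparison is an assumption based on the wording "exceeds this size" in the changelog entry.

```python
def configured_limit_gb(target_gb: float, margin: float = 0.20) -> float:
    """Value to set for DATETIME_INDEX_MAX_SIZE_GB for a desired target size."""
    return round(target_gb * (1.0 + margin), 2)


def exceeds_limit(current_size_gb: float, limit_gb: float = 25.0) -> bool:
    """True when an index has grown past the limit and a new one is due."""
    return current_size_gb > limit_gb


print(configured_limit_gb(25))   # 30.0
print(exceeds_limit(24.9))       # False
print(exceeds_limit(25.1))       # True
```

So a deployment aiming for roughly 25 GB per index would set `DATETIME_INDEX_MAX_SIZE_GB=30`.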
+
 ## Interacting with the API

 - **Creating a Collection**:
@@ -538,4 +615,3 @@ You can customize additional settings in your `.env` file:
 - Ensures fair resource allocation among all clients

 - **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.