Commit 05adf13

Yuri Zmytrakov authored and committed
Merge branch 'main' into CAT-1382
2 parents 2a8dd32 + 0988448 commit 05adf13

16 files changed

+580
-113
lines changed

.pre-commit-config.yaml

Lines changed: 2 additions & 1 deletion
@@ -31,7 +31,8 @@ repos:
         ]
         additional_dependencies: [
           "types-attrs",
-          "types-requests"
+          "types-requests",
+          "types-redis"
         ]
   - repo: https://github.com/PyCQA/pydocstyle
     rev: 6.1.1

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -9,8 +9,13 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 
 ### Added
 
+- GET `/collections` collection search free text extension ex. `/collections?q=sentinel`. [#470](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/470)
 - Added `USE_DATETIME` environment variable to configure datetime search behavior in SFEOS. [#452](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/452)
 - GET `/collections` collection search sort extension ex. `/collections?sortby=+id`. [#456](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/456)
+- GET `/collections` collection search fields extension ex. `/collections?fields=id,title`. [#465](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/465)
+- Improved error messages for sorting on unsortable fields in collection search, including guidance on how to make fields sortable. [#465](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/465)
+- Added field alias for `temporal` to enable easier sorting by temporal extent, alongside `extent.temporal.interval`. [#465](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/465)
+- Added `ENABLE_COLLECTIONS_SEARCH` environment variable to make collection search extensions optional (defaults to enabled). [#465](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/465)
 
 ### Changed
 

Makefile

Lines changed: 4 additions & 4 deletions
@@ -63,22 +63,22 @@ docker-shell-os:
 
 .PHONY: test-elasticsearch
 test-elasticsearch:
-	-$(run_es) /bin/bash -c 'pip install redis==6.4.0 export && ./scripts/wait-for-it-es.sh elasticsearch:9200 && cd stac_fastapi/tests/ && pytest'
+	-$(run_es) /bin/bash -c 'export && ./scripts/wait-for-it-es.sh elasticsearch:9200 && cd stac_fastapi/tests/ && pytest'
 	docker compose down
 
 .PHONY: test-opensearch
 test-opensearch:
-	-$(run_os) /bin/bash -c 'pip install redis==6.4.0 export && ./scripts/wait-for-it-es.sh opensearch:9202 && cd stac_fastapi/tests/ && pytest'
+	-$(run_os) /bin/bash -c 'export && ./scripts/wait-for-it-es.sh opensearch:9202 && cd stac_fastapi/tests/ && pytest'
 	docker compose down
 
 .PHONY: test-datetime-filtering-es
 test-datetime-filtering-es:
-	-$(run_es) /bin/bash -c 'pip install redis==6.4.0 && export ENABLE_DATETIME_INDEX_FILTERING=true && ./scripts/wait-for-it-es.sh elasticsearch:9200 && cd stac_fastapi/tests/ && pytest -s --cov=stac_fastapi --cov-report=term-missing -m datetime_filtering'
+	-$(run_es) /bin/bash -c 'export ENABLE_DATETIME_INDEX_FILTERING=true && ./scripts/wait-for-it-es.sh elasticsearch:9200 && cd stac_fastapi/tests/ && pytest -s --cov=stac_fastapi --cov-report=term-missing -m datetime_filtering'
 	docker compose down
 
 .PHONY: test-datetime-filtering-os
 test-datetime-filtering-os:
-	-$(run_os) /bin/bash -c 'pip install redis==6.4.0 && export ENABLE_DATETIME_INDEX_FILTERING=true && ./scripts/wait-for-it-es.sh opensearch:9202 && cd stac_fastapi/tests/ && pytest -s --cov=stac_fastapi --cov-report=term-missing -m datetime_filtering'
+	-$(run_os) /bin/bash -c 'export ENABLE_DATETIME_INDEX_FILTERING=true && ./scripts/wait-for-it-es.sh opensearch:9202 && cd stac_fastapi/tests/ && pytest -s --cov=stac_fastapi --cov-report=term-missing -m datetime_filtering'
 	docker compose down
 
 .PHONY: test

README.md

Lines changed: 38 additions & 2 deletions
@@ -36,11 +36,10 @@ SFEOS (stac-fastapi-elasticsearch-opensearch) is a high-performance, scalable AP
 - **Scale to millions of geospatial assets** with fast search performance through optimized spatial indexing and query capabilities
 - **Support OGC-compliant filtering** including spatial operations (intersects, contains, etc.) and temporal queries
 - **Perform geospatial aggregations** to analyze data distribution across space and time
+- **Enhanced collection search capabilities** with support for sorting and field selection
 
 This implementation builds on the STAC-FastAPI framework, providing a production-ready solution specifically optimized for Elasticsearch and OpenSearch databases. It's ideal for organizations managing large geospatial data catalogs who need efficient discovery and access capabilities through standardized APIs.
 
-
-
 ## Common Deployment Patterns
 
 stac-fastapi-elasticsearch-opensearch can be deployed in several ways depending on your needs:
@@ -72,6 +71,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Common Deployment Patterns](#common-deployment-patterns)
 - [Technologies](#technologies)
 - [Table of Contents](#table-of-contents)
+- [Collection Search Extensions](#collection-search-extensions)
 - [Documentation \& Resources](#documentation--resources)
 - [Package Structure](#package-structure)
 - [Examples](#examples)
@@ -113,6 +113,37 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Gitter Chat](https://app.gitter.im/#/room/#stac-fastapi-elasticsearch_community:gitter.im) - For real-time discussions
 - [GitHub Discussions](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/discussions) - For longer-form questions and answers
 
+## Collection Search Extensions
+
+SFEOS implements extended capabilities for the `/collections` endpoint, allowing for more powerful collection discovery:
+
+- **Sorting**: Sort collections by sortable fields using the `sortby` parameter
+  - Example: `/collections?sortby=+id` (ascending sort by ID)
+  - Example: `/collections?sortby=-id` (descending sort by ID)
+  - Example: `/collections?sortby=-temporal` (descending sort by temporal extent)
+
+- **Field Selection**: Request only specific fields to be returned using the `fields` parameter
+  - Example: `/collections?fields=id,title,description`
+  - This helps reduce payload size when only certain fields are needed
+
+- **Free Text Search**: Search across collection text fields using the `q` parameter
+  - Example: `/collections?q=landsat`
+  - Searches across multiple text fields including title, description, and keywords
+  - Supports partial word matching and relevance-based sorting
+
+These extensions make it easier to build user interfaces that display and navigate through collections efficiently.
+
+> **Configuration**: Collection search extensions can be disabled by setting the `ENABLE_COLLECTIONS_SEARCH` environment variable to `false`. By default, these extensions are enabled.
+
+> **Note**: Sorting is only available on fields that are indexed for sorting in Elasticsearch/OpenSearch. With the default mappings, you can sort on:
+> - `id` (keyword field)
+> - `extent.temporal.interval` (date field)
+> - `temporal` (alias to extent.temporal.interval)
+>
+> Text fields like `title` and `description` are not sortable by default as they use text analysis for better search capabilities. Attempting to sort on these fields will result in a user-friendly error message explaining which fields are sortable and how to make additional fields sortable by updating the mappings.
+>
+> **Important**: Adding keyword fields to make text fields sortable can significantly increase the index size, especially for large text fields. Consider the storage implications when deciding which fields to make sortable.
+
 ## Package Structure
 
 This project is organized into several packages, each with a specific purpose:
@@ -243,6 +274,7 @@ You can customize additional settings in your `.env` file:
 | `ENABLE_DIRECT_RESPONSE` | Enable direct response for maximum performance (disables all FastAPI dependencies, including authentication, custom status codes, and validation) | `false` | Optional |
 | `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
 | `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
+| `ENABLE_COLLECTIONS_SEARCH` | Enable collection search extensions (sort, fields). | `true` | Optional |
 | `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
 | `STAC_ITEM_LIMIT` | Sets the environment variable for result limiting to SFEOS for the number of returned items and STAC collections. | `10` | Optional |
 | `STAC_INDEX_ASSETS` | Controls if Assets are indexed when added to Elasticsearch/Opensearch. This allows asset fields to be included in search queries. | `false` | Optional |
@@ -389,6 +421,10 @@ The system uses a precise naming convention:
 - **Root Path Configuration**: The application root path is the base URL by default.
   - For AWS Lambda with Gateway API: Set `STAC_FASTAPI_ROOT_PATH` to match the Gateway API stage name (e.g., `/v1`)
 
+- **Feature Configuration**: Control which features are enabled:
+  - `ENABLE_COLLECTIONS_SEARCH`: Set to `true` (default) to enable collection search extensions (sort, fields). Set to `false` to disable.
+  - `ENABLE_TRANSACTIONS_EXTENSIONS`: Set to `true` (default) to enable transaction extensions. Set to `false` to disable.
+
 
 ## Collection Pagination
 
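
Reviewer note on the README additions: the three query parameters documented there (`sortby`, `fields`, `q`) can be combined in one request. The sketch below shows how a client might assemble such a URL. The helper name `collections_url` and the base URL `http://localhost:8080` are hypothetical and not part of this commit. One detail worth showing is that a literal `+` in a query string is normally decoded as a space, so a client that wants to send `sortby=+id` should percent-encode it as `%2B`.

```python
from typing import Dict, Optional, Sequence
from urllib.parse import urlencode


def collections_url(
    base_url: str,
    sortby: Optional[str] = None,
    fields: Optional[Sequence[str]] = None,
    q: Optional[str] = None,
) -> str:
    """Build a /collections URL using the parameters described in the README.

    Hypothetical client-side helper; only the parameter names come from the commit.
    """
    params: Dict[str, str] = {}
    if sortby:
        params["sortby"] = sortby  # e.g. "+id", "-id", "-temporal"
    if fields:
        params["fields"] = ",".join(fields)  # comma-separated field names
    if q:
        params["q"] = q  # free text search term
    # safe="," keeps the commas readable; "+" is still encoded as %2B.
    query = urlencode(params, safe=",")
    url = f"{base_url.rstrip('/')}/collections"
    return f"{url}?{query}" if query else url


url = collections_url(
    "http://localhost:8080", sortby="+id", fields=["id", "title"], q="landsat"
)
print(url)
```

A descending sort such as `sortby="-temporal"` needs no special handling, because `-` is left unescaped by `urlencode`.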

docker-compose.redis.yml

Lines changed: 0 additions & 27 deletions
This file was deleted.

dockerfiles/Dockerfile.dev.es

Lines changed: 1 addition & 0 deletions
@@ -18,3 +18,4 @@ COPY . /app
 RUN pip install --no-cache-dir -e ./stac_fastapi/core
 RUN pip install --no-cache-dir -e ./stac_fastapi/sfeos_helpers
 RUN pip install --no-cache-dir -e ./stac_fastapi/elasticsearch[dev,server]
+RUN pip install --no-cache-dir redis types-redis

mypy.ini

Lines changed: 0 additions & 3 deletions
This file was deleted.

stac_fastapi/core/stac_fastapi/core/core.py

Lines changed: 33 additions & 3 deletions
@@ -230,11 +230,18 @@ async def landing_page(self, **kwargs) -> stac_types.LandingPage:
         return landing_page
 
     async def all_collections(
-        self, sortby: Optional[str] = None, **kwargs
+        self,
+        fields: Optional[List[str]] = None,
+        sortby: Optional[str] = None,
+        q: Optional[Union[str, List[str]]] = None,
+        **kwargs,
     ) -> stac_types.Collections:
         """Read all collections from the database.
 
         Args:
+            fields (Optional[List[str]]): Fields to include or exclude from the results.
+            sortby (Optional[str]): Sorting options for the results.
+            q (Optional[List[str]]): Free text search terms.
             **kwargs: Keyword arguments from the request.
 
         Returns:
@@ -245,6 +252,15 @@ async def all_collections(
         limit = int(request.query_params.get("limit", os.getenv("STAC_ITEM_LIMIT", 10)))
         token = request.query_params.get("token")
 
+        # Process fields parameter for filtering collection properties
+        includes, excludes = set(), set()
+        if fields and self.extension_is_enabled("FieldsExtension"):
+            for field in fields:
+                if field[0] == "-":
+                    excludes.add(field[1:])
+                else:
+                    includes.add(field[1:] if field[0] in "+ " else field)
+
         sort = None
         if sortby:
             parsed_sort = []
@@ -267,10 +283,24 @@ async def all_collections(
         except Exception:
             redis = None
 
+        # Convert q to a list if it's a string
+        q_list = None
+        if q is not None:
+            q_list = [q] if isinstance(q, str) else q
+
         collections, next_token = await self.database.get_all_collections(
-            token=token, limit=limit, request=request, sort=sort
+            token=token, limit=limit, request=request, sort=sort, q=q_list
         )
 
+        # Apply field filtering if fields parameter was provided
+        if fields and self.extension_is_enabled("FieldsExtension"):
+            filtered_collections = [
+                filter_fields(collection, includes, excludes)
+                for collection in collections
+            ]
+        else:
+            filtered_collections = collections
+
         links = [
             {"rel": Relations.root.value, "type": MimeTypes.json, "href": base_url},
             {"rel": Relations.parent.value, "type": MimeTypes.json, "href": base_url},
@@ -301,7 +331,7 @@ async def all_collections(
             next_link = PagingLinks(next=next_token, request=request).link_next()
             links.append(next_link)
 
-        return stac_types.Collections(collections=collections, links=links)
+        return stac_types.Collections(collections=filtered_collections, links=links)
 
     async def get_collection(
         self, collection_id: str, **kwargs
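
Reviewer note on the `fields` and `q` handling in `all_collections`: the parsing logic is easier to reason about in isolation. The sketch below reproduces only the include/exclude split and the `q` normalization from the hunks above. The function names `split_fields` and `normalize_q` are hypothetical, and the empty-string guard is an addition that is not in the commit (as written, `field[0]` would raise `IndexError` on an empty entry such as the one produced by a trailing comma). `filter_fields` is not shown in this diff, so it is left out. Accepting a leading space alongside `+` presumably covers the case where an unencoded `+` in the query string was decoded as a space.

```python
from typing import Iterable, List, Optional, Set, Tuple, Union


def split_fields(fields: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Split field names into include and exclude sets, as in the hunk above."""
    includes: Set[str] = set()
    excludes: Set[str] = set()
    for field in fields:
        if not field:
            continue  # assumption: skip empty entries instead of failing on field[0]
        if field[0] == "-":
            excludes.add(field[1:])
        else:
            # A leading "+" (or a space left by URL decoding) is dropped.
            includes.add(field[1:] if field[0] in "+ " else field)
    return includes, excludes


def normalize_q(q: Optional[Union[str, List[str]]]) -> Optional[List[str]]:
    """Wrap a single search term in a list; pass lists and None through."""
    if q is None:
        return None
    return [q] if isinstance(q, str) else q


includes, excludes = split_fields(["id", "+title", " keywords", "-description", ""])
print(sorted(includes), sorted(excludes))
print(normalize_q("sentinel"), normalize_q(["landsat", "sentinel"]))
```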

stac_fastapi/core/stac_fastapi/core/redis_utils.py

Lines changed: 30 additions & 31 deletions
@@ -15,7 +15,7 @@ class RedisSentinelSettings(BaseSettings):
     REDIS_SENTINEL_HOSTS: str = ""
     REDIS_SENTINEL_PORTS: str = "26379"
     REDIS_SENTINEL_MASTER_NAME: str = "master"
-    REDIS_DB: int = 15
+    REDIS_DB: int = 0
 
     REDIS_MAX_CONNECTIONS: int = 10
     REDIS_RETRY_TIMEOUT: bool = True
@@ -25,7 +25,7 @@ class RedisSentinelSettings(BaseSettings):
 
 
 class RedisSettings(BaseSettings):
-    """Configuration for connecting Redis Sentinel."""
+    """Configuration for connecting Redis."""
 
     REDIS_HOST: str = ""
     REDIS_PORT: int = 6379
@@ -42,36 +42,11 @@ class RedisSettings(BaseSettings):
 redis_settings: BaseSettings = RedisSentinelSettings()
 
 
-async def connect_redis(settings: Optional[RedisSettings] = None) -> aioredis.Redis:
-    """Return a Redis connection."""
-    global redis_pool
-    settings = settings or redis_settings
-
-    if not settings.REDIS_HOST or not settings.REDIS_PORT:
-        return None
-
-    if redis_pool is None:
-        pool = aioredis.ConnectionPool(
-            host=settings.REDIS_HOST,
-            port=settings.REDIS_PORT,
-            db=settings.REDIS_DB,
-            max_connections=settings.REDIS_MAX_CONNECTIONS,
-            decode_responses=settings.REDIS_DECODE_RESPONSES,
-            retry_on_timeout=settings.REDIS_RETRY_TIMEOUT,
-            health_check_interval=settings.REDIS_HEALTH_CHECK_INTERVAL,
-        )
-        redis_pool = aioredis.Redis(
-            connection_pool=pool, client_name=settings.REDIS_CLIENT_NAME
-        )
-    return redis_pool
-
-
 async def connect_redis_sentinel(
     settings: Optional[RedisSentinelSettings] = None,
 ) -> Optional[aioredis.Redis]:
-    """Return a Redis Sentinel connection."""
+    """Return Redis Sentinel connection."""
     global redis_pool
-
     settings = settings or redis_settings
 
     if (
@@ -89,7 +64,7 @@ async def connect_redis_sentinel(
     if redis_pool is None:
         try:
             sentinel = Sentinel(
-                [(h, p) for h, p in zip(hosts, ports)],
+                [(host, port) for host, port in zip(hosts, ports)],
                 decode_responses=settings.REDIS_DECODE_RESPONSES,
             )
             master = sentinel.master_for(
@@ -109,16 +84,40 @@ async def connect_redis_sentinel(
     return redis_pool
 
 
+async def connect_redis(settings: Optional[RedisSettings] = None) -> aioredis.Redis:
+    """Return Redis connection."""
+    global redis_pool
+    settings = settings or redis_settings
+
+    if not settings.REDIS_HOST or not settings.REDIS_PORT:
+        return None
+
+    if redis_pool is None:
+        pool = aioredis.ConnectionPool(
+            host=settings.REDIS_HOST,
+            port=settings.REDIS_PORT,
+            db=settings.REDIS_DB,
+            max_connections=settings.REDIS_MAX_CONNECTIONS,
+            decode_responses=settings.REDIS_DECODE_RESPONSES,
+            retry_on_timeout=settings.REDIS_RETRY_TIMEOUT,
+            health_check_interval=settings.REDIS_HEALTH_CHECK_INTERVAL,
+        )
+        redis_pool = aioredis.Redis(
+            connection_pool=pool, client_name=settings.REDIS_CLIENT_NAME
+        )
+    return redis_pool
+
+
 async def save_self_link(
     redis: aioredis.Redis, token: Optional[str], self_href: str
 ) -> None:
-    """Save the self link for the current token with 30 min TTL."""
+    """Add the self link for next page as prev link for the current token."""
     if token:
         await redis.setex(f"nav:self:{token}", 1800, self_href)
 
 
 async def get_prev_link(redis: aioredis.Redis, token: Optional[str]) -> Optional[str]:
-    """Pull the previous page link for the current token (if exists)."""
+    """Pull the prev page link for the current token."""
     if not token:
         return None
     return await redis.get(f"nav:self:{token}")
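
Reviewer note on `save_self_link` / `get_prev_link`: both helpers use the same key, `nav:self:{token}`, so the URL stored while serving one page can be read back when the following page is requested with that token. The call sites are not part of this diff, so the sketch below is one plausible reading of the flow rather than the project's actual usage. `FakeRedis` is a hypothetical in-memory stand-in that ignores the 1800-second TTL, and the URLs and token value are made up.

```python
import asyncio
from typing import Dict, Optional


class FakeRedis:
    """Hypothetical in-memory stand-in for the async Redis client (TTL ignored)."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    async def setex(self, name: str, time: int, value: str) -> None:
        self.store[name] = value

    async def get(self, name: str) -> Optional[str]:
        return self.store.get(name)


async def save_self_link(redis, token: Optional[str], self_href: str) -> None:
    # Same body as in the diff: store the current page URL under the next-page token.
    if token:
        await redis.setex(f"nav:self:{token}", 1800, self_href)


async def get_prev_link(redis, token: Optional[str]) -> Optional[str]:
    # Same body as in the diff: look up the URL saved for this token, if any.
    if not token:
        return None
    return await redis.get(f"nav:self:{token}")


async def demo() -> Optional[str]:
    redis = FakeRedis()
    # Page 1 is served and hands out the token "abc" for page 2.
    await save_self_link(redis, "abc", "http://localhost:8080/collections?limit=10")
    # Page 2 is requested with token=abc; page 1's URL becomes its prev link.
    return await get_prev_link(redis, "abc")


prev = asyncio.run(demo())
print(prev)
```

With a real Redis instance the entry would expire after 30 minutes, at which point `get_prev_link` would return `None` and the response would presumably omit the previous-page link.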
