Skip to content

Conversation

varshaprasad96
Copy link
Contributor

What does this PR do?

This PR implements keyword search in Qdrant.

Part of: #3009

Test Plan

pytest tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes --stack-config=http://localhost:8321 --embedding-model=all-minilm -v
INFO     2025-08-11 15:36:48,421 tests.integration.conftest:64 tests: Setting DISABLE_CODE_SANDBOX=1 for macOS                                        
=========================================================================================== test session starts ============================================================================================
platform darwin -- Python 3.12.11, pytest-7.4.4, pluggy-1.5.0 -- /Users/vnarsing/miniconda3/envs/stack-client/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.11', 'Platform': 'macOS-14.7.6-arm64-arm-64bit', 'Packages': {'pytest': '7.4.4', 'pluggy': '1.5.0'}, 'Plugins': {'asyncio': '0.23.8', 'cov': '6.0.0', 'timeout': '2.2.0', 'socket': '0.7.0', 'html': '3.1.1', 'langsmith': '0.3.39', 'anyio': '4.8.0', 'metadata': '3.0.0'}}
rootdir: /Users/vnarsing/go/src/github/meta-llama/llama-stack
configfile: pyproject.toml
plugins: asyncio-0.23.8, cov-6.0.0, timeout-2.2.0, socket-0.7.0, html-3.1.1, langsmith-0.3.39, anyio-4.8.0, metadata-3.0.0
asyncio: mode=Mode.AUTO
collected 3 items                                                                                                                                                                                          

tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-minilm-None-384-vector] PASSED                                                         [ 33%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-minilm-None-384-keyword] PASSED                                                        [ 66%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-minilm-None-384-hybrid] SKIPPED (Search mode 'hybrid' is not supported by any avai...) [100%]

======================================================================================= 2 passed, 1 skipped in 0.27s =======================================================================================

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Aug 11, 2025
@varshaprasad96
Copy link
Contributor Author

cc: @franciscojavierarceo can you take a look please?

results = (
await self.client.query_points(
collection_name=self.collection_name,
query_filter=models.Filter(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, looks like this is the appropriate way to handle things https://qdrant.tech/documentation/concepts/filtering/#full-text-match

Copy link
Collaborator

@franciscojavierarceo franciscojavierarceo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update vector_io/conftest.py to test qdrant? I noticed earlier today that the fixture didn't have it enabled. See here:

@pytest.fixture(params=["milvus", "sqlite_vec", "faiss", "chroma"])
def vector_provider(request):
    return request.param

@varshaprasad96 varshaprasad96 force-pushed the qdrant-keyword-search branch 2 times, most recently from 5e318d0 to af7f51d Compare August 13, 2025 05:54
@varshaprasad96
Copy link
Contributor Author

varshaprasad96 commented Aug 13, 2025

Looking into fixing unit tests.
EDIT: Fixed.

@varshaprasad96 varshaprasad96 force-pushed the qdrant-keyword-search branch 3 times, most recently from be0d6d3 to 7da7ce9 Compare August 13, 2025 21:19
@varshaprasad96
Copy link
Contributor Author

@franciscojavierarceo could you ptal ?

@varshaprasad96 varshaprasad96 force-pushed the qdrant-keyword-search branch 3 times, most recently from 4e5ac14 to 5ce1492 Compare August 15, 2025 22:17
@varshaprasad96 varshaprasad96 requested a review from ashwinb August 15, 2025 22:17
@varshaprasad96
Copy link
Contributor Author

@ashwinb @franciscojavierarceo can we have some eyes on this to get this merged. Thanks!

assert len(response_negative_threshold.chunks) > 0


async def test_query_chunks_keyword_search_edge_cases(qdrant_vec_index, sample_chunks, sample_embeddings):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this test actually needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why we shouldn't have it. This is lightweight and only testing that a valid QueryChunksResponse is returned without crashing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm intentionally not testing the return specifics, as that depends on the provider. But making sure that these values are at least accepted so that the input tests are at least robust.

This commit implements keyword search in Qdrant.

Signed-off-by: Varsha Prasad Narsing <[email protected]>

@pytest.fixture
def qdrant_vec_db_path(tmp_path_factory):
def qdrant_vec_db_path(tmp_path):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change from tmp_path_factory?


adapter.initialize = safe_initialize
await adapter.initialize()
await adapter.register_vector_db(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this no longer needed? definitely right thing to do but just want to confirm.

Copy link
Collaborator

@franciscojavierarceo franciscojavierarceo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit on the tests but this looks good to me and it'd be good to unlock this for the community.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants