@ahmed-bhs

| Q | A |
| --- | --- |
| Bug fix? | no |
| New feature? | yes |
| Docs? | no |
| Issues | |
| License | MIT |

Problem

Choosing between vector search and full-text search forces a trade-off:

  • Vector search: Best for semantic similarity but may rank exact term matches lower
  • Full-text search: Best for lexical matching but misses semantic relationships

Users often need both in the same query: conceptual understanding + lexical precision.

Solution

Hybrid search combining both using Reciprocal Rank Fusion (RRF), following Supabase's approach.

RRF merges rankings from vector similarity and PostgreSQL Full-Text Search (ts_rank_cd).

Features

  • Configurable ratio: 0.0 (FTS) to 1.0 (vector)
  • RRF fusion: k=60 default (same as Supabase)
  • Multilingual: Language-agnostic by default
  • Optional filtering: defaultMaxScore to filter irrelevant results
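As a rough illustration of how the ratio and k interact, here is a minimal sketch of weighted Reciprocal Rank Fusion in plain PHP. The function name `rrfFuse` and the exact weighting (the vector list weighted by the ratio, the FTS list by one minus the ratio) are assumptions made for this example; the store itself performs the fusion in SQL and may weight ranks differently.

```php
<?php

declare(strict_types=1);

/**
 * Fuse two ranked lists of document IDs with weighted Reciprocal Rank Fusion.
 * Hypothetical helper for illustration only.
 *
 * @param list<string> $vectorRanking IDs ordered best-first by vector similarity
 * @param list<string> $ftsRanking    IDs ordered best-first by full-text rank
 *
 * @return array<string, float> fused score per ID, highest score first
 */
function rrfFuse(array $vectorRanking, array $ftsRanking, float $semanticRatio = 0.5, int $k = 60): array
{
    if ($semanticRatio < 0.0 || $semanticRatio > 1.0) {
        throw new InvalidArgumentException('semanticRatio must be between 0.0 and 1.0.');
    }

    $scores = [];
    $lists = [
        [$vectorRanking, $semanticRatio],     // weight of the semantic side
        [$ftsRanking, 1.0 - $semanticRatio],  // weight of the keyword side
    ];

    foreach ($lists as [$ranking, $weight]) {
        foreach (array_values($ranking) as $index => $id) {
            // Ranks are 1-based: the top hit contributes weight / (k + 1).
            $scores[$id] = ($scores[$id] ?? 0.0) + $weight / ($k + $index + 1);
        }
    }

    arsort($scores); // sort by score, descending, keeping the ID keys

    return $scores;
}

// 'doc-a' ranks high in both lists, so it ends up first after fusion.
$fused = rrfFuse(['doc-a', 'doc-b', 'doc-c'], ['doc-c', 'doc-a', 'doc-d']);
print_r(array_keys($fused));
```

A larger k flattens the difference between adjacent ranks, so documents that appear in both lists gain more than a document that tops only one of them.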

Implementation

  • PostgreSQL FTS (ts_rank_cd) + pgvector
  • GIN index for FTS performance
  • Generated tsvector column
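The three pieces above could translate into DDL along the following lines. This is a sketch under assumptions, not the setup code from this PR: the function name `buildHybridSetupSql`, the column names `id`, `content`, `metadata` and `embedding`, and the index name are all made up for illustration. Only `content_tsv` and the `'simple'` configuration appear elsewhere in this conversation.

```php
<?php

declare(strict_types=1);

/**
 * Build illustrative DDL statements for a hybrid-search table.
 * Table layout and names are assumptions for this sketch.
 *
 * @return list<string>
 */
function buildHybridSetupSql(string $table, int $dimensions, string $language = 'simple'): array
{
    // Identifiers and the text search configuration cannot be bound as
    // parameters in DDL, so restrict them to a safe character set.
    foreach ([$table, $language] as $identifier) {
        if (1 !== preg_match('/^[a-z_][a-z0-9_]*$/iD', $identifier)) {
            throw new InvalidArgumentException(sprintf('Invalid identifier "%s".', $identifier));
        }
    }

    if ($dimensions < 1) {
        throw new InvalidArgumentException('The vector dimension must be positive.');
    }

    return [
        // pgvector provides the "vector" column type.
        'CREATE EXTENSION IF NOT EXISTS vector',
        sprintf(
            "CREATE TABLE IF NOT EXISTS %s (
                id UUID PRIMARY KEY,
                content TEXT NOT NULL,
                metadata JSONB,
                embedding vector(%d) NOT NULL,
                content_tsv tsvector GENERATED ALWAYS AS (to_tsvector('%s', content)) STORED
            )",
            $table,
            $dimensions,
            $language
        ),
        // The GIN index keeps the @@ match on content_tsv fast.
        sprintf('CREATE INDEX IF NOT EXISTS %1$s_content_tsv_idx ON %1$s USING gin (content_tsv)', $table),
    ];
}

foreach (buildHybridSetupSql('documents', 1536) as $statement) {
    echo $statement, ";\n\n";
}
```

Because the tsvector column is generated and stored, PostgreSQL keeps it in sync with `content` on every insert or update, so no trigger is needed. Generated columns require PostgreSQL 12 or later.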

Example

$store = new PostgresHybridStore(
    connection: $pdo,
    tableName: 'documents',
    semanticRatio: 0.5,  // 50% vector + 50% FTS
    rrfK: 60,
);

$results = $store->query($vector, ['q' => 'PostgreSQL', 'limit' => 10]);

@carsonbot carsonbot added Feature New feature Store Issues & PRs about the AI Store component Status: Needs Review labels Oct 15, 2025
Combines pgvector semantic search with PostgreSQL Full-Text Search
using Reciprocal Rank Fusion (RRF), following Supabase approach.

Features:
- Configurable semantic/keyword ratio (0.0 to 1.0)
- RRF fusion with customizable k parameter
- Multilingual FTS support (default: 'simple')
- Optional relevance filtering with defaultMaxScore
- All pgvector distance metrics supported
@ahmed-bhs ahmed-bhs force-pushed the feature/postgres-hybrid-search branch from 1284fcf to 6c7c7e3 Compare October 15, 2025 12:56
@ahmed-bhs ahmed-bhs force-pushed the feature/postgres-hybrid-search branch from 3807878 to 8d4ccfe Compare October 16, 2025 07:36
@chr-hertel chr-hertel requested a review from Copilot October 23, 2025 19:06
Copilot AI left a comment

Pull Request Overview

This PR introduces PostgresHybridStore, a new vector store implementation that combines semantic vector search (pgvector) with PostgreSQL Full-Text Search (FTS) using Reciprocal Rank Fusion (RRF), following Supabase's hybrid search approach.

Key changes:

  • Implements configurable hybrid search with adjustable semantic ratio (0.0 for pure FTS, 1.0 for pure vector, 0.5 for balanced)
  • Uses RRF algorithm with k=60 default to merge vector similarity and ts_rank_cd rankings
  • Supports multilingual content through configurable PostgreSQL text search configurations

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/store/src/Bridge/Postgres/PostgresHybridStore.php | Core implementation of hybrid store with vector/FTS query building, RRF fusion logic, and table setup with tsvector generation |
| src/store/tests/Bridge/Postgres/PostgresHybridStoreTest.php | Comprehensive test coverage for constructor validation, setup, pure vector/FTS queries, hybrid RRF queries, and various configuration options |

Comment on lines 259 to 260
// Combine existing WHERE clause with FTS filter
$whereClause = str_replace('WHERE ', "WHERE $ftsFilter AND ", $whereClause);

Copilot AI Oct 23, 2025

Using str_replace to inject the FTS filter is fragile and could fail if the WHERE clause contains the substring 'WHERE ' in other contexts (e.g., in a string literal or subquery). Consider using a more robust approach such as parsing the clause structure or prepending the filter condition with proper AND logic.

Suggested change
- // Combine existing WHERE clause with FTS filter
- $whereClause = str_replace('WHERE ', "WHERE $ftsFilter AND ", $whereClause);
+ // Combine existing WHERE clause with FTS filter using AND logic
+ // Assumes $whereClause starts with 'WHERE '
+ $whereClause = rtrim($whereClause);
+ if (str_starts_with($whereClause, 'WHERE ')) {
+     $whereClause .= " AND $ftsFilter";
+ } else {
+     // Unexpected format, prepend WHERE
+     $whereClause = "WHERE $ftsFilter AND " . ltrim($whereClause);
+ }

Member

This caught my eye as well, and I was wondering if the WHERE building could be brought into one single place instead of being spread over three methods. What do you think?

Author

Good catch! I've centralized the WHERE clause building logic into a dedicated addFilterToWhereClause() helper method. This removes the duplication across buildFtsOnlyQuery() and buildHybridQuery(), and replaces the fragile str_replace() call with a str_starts_with() check.
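For readers following the thread, a helper of this kind might look roughly like the sketch below. Only the name addFilterToWhereClause() comes from the comment above. The signature, the standalone function form, and the behavior for empty or malformed clauses are assumptions for illustration and may differ from what was pushed.

```php
<?php

declare(strict_types=1);

/**
 * Append a filter condition to an optional WHERE clause.
 * Illustrative sketch; not the code from this PR.
 */
function addFilterToWhereClause(string $whereClause, string $filter): string
{
    $whereClause = trim($whereClause);

    // No existing conditions: the filter becomes the whole clause.
    if ('' === $whereClause) {
        return 'WHERE '.$filter;
    }

    if (str_starts_with($whereClause, 'WHERE ')) {
        // Parenthesize the existing condition so an OR inside it cannot
        // change the meaning of the appended AND.
        return sprintf('WHERE (%s) AND %s', substr($whereClause, 6), $filter);
    }

    // Anything else is a programming error rather than something to guess at.
    throw new InvalidArgumentException('Expected an empty clause or one starting with "WHERE ".');
}

echo addFilterToWhereClause(
    "WHERE metadata->>'lang' = 'en' OR metadata->>'lang' = 'fr'",
    "content_tsv @@ websearch_to_tsquery('simple', :query)"
), "\n";
```

Checking only the start of the clause avoids rewriting a 'WHERE ' that happens to occur inside a string literal or subquery, which was the concern raised in the review.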

$ftsFilter = \sprintf("content_tsv @@ websearch_to_tsquery('%s', :query)", $this->language);

if ('' !== $whereClause) {
    $ftsWhereClause = str_replace('WHERE ', "WHERE $ftsFilter AND ", $whereClause);

Copilot AI Oct 23, 2025

Same issue as in buildFtsOnlyQuery: using str_replace to inject the FTS filter is fragile and could produce incorrect SQL if 'WHERE ' appears in unexpected contexts. Consider a more robust approach to combining WHERE conditions.

Member

@chr-hertel chr-hertel left a comment

In general this is a super cool feature. Some Copilot findings seem valid to me, please check.

On top of that, I was unsure whether every sprintf needs to be a sprintf or whether some values can/should be prepared parameters. It would be great to double-check that as well, please.
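To make the sprintf question concrete: in a SELECT, the text search configuration can be passed as a bound parameter through a cast to regconfig, so only the table name (an identifier, which cannot be bound) still needs string interpolation. The sketch below is an assumption-laden illustration, not code from this PR. `buildFtsQuery`, the column names, and the query shape are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Build a full-text query where the language and search text are bound
 * parameters. Hypothetical helper for illustration only.
 *
 * @return array{0: string, 1: array<string, string|int>} SQL and its parameters
 */
function buildFtsQuery(string $table, string $language, string $searchText, int $limit): array
{
    // Identifiers cannot be bound, so the table name is validated instead.
    if (1 !== preg_match('/^[a-z_][a-z0-9_]*$/iD', $table)) {
        throw new InvalidArgumentException(sprintf('Invalid table name "%s".', $table));
    }

    // The CTE computes the tsquery once, so each placeholder appears only once.
    $sql = sprintf(
        'WITH q AS (SELECT websearch_to_tsquery(CAST(:language AS regconfig), :query) AS tsq)
         SELECT d.id, d.content, ts_rank_cd(d.content_tsv, q.tsq) AS score
         FROM %s d, q
         WHERE d.content_tsv @@ q.tsq
         ORDER BY score DESC
         LIMIT :limit',
        $table
    );

    return [$sql, ['language' => $language, 'query' => $searchText, 'limit' => $limit]];
}

[$sql, $params] = buildFtsQuery('documents', 'simple', 'PostgreSQL', 10);
echo $sql, "\n";

// With a PDO connection, the statement would then be run along these lines:
// $statement = $pdo->prepare($sql);
// $statement->execute($params);
```

This does not carry over to the generated tsvector column, where the configuration has to be a constant in the DDL, so validation remains the safeguard there.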

 *
 * @author Ahmed EBEN HASSINE <ahmedbhs123@æmail.com>
 */
final readonly class PostgresHybridStore implements ManagedStoreInterface, StoreInterface
Member

let's just call it HybridStore instead

Suggested change
- final readonly class PostgresHybridStore implements ManagedStoreInterface, StoreInterface
+ final readonly class HybridStore implements ManagedStoreInterface, StoreInterface

Author

@chr-hertel Thanks for reviewing. I've just renamed it.

- Extract WHERE clause logic into addFilterToWhereClause() helper method
- Fix embedding param logic: ensure it's set before maxScore uses it
- Replace fragile str_replace() with robust str_starts_with() approach
- Remove code duplication between buildFtsOnlyQuery and buildHybridQuery

This addresses review feedback about fragile WHERE clause manipulation
and centralizes the logic in a single, reusable method.
- Rename class from PostgresHybridStore to HybridStore
- The namespace already indicates it's Postgres-specific
- Add postgres-hybrid.php RAG example demonstrating:
  * Different semantic ratios (0.0, 0.5, 1.0)
  * RRF (Reciprocal Rank Fusion) hybrid search
  * Full-text search with 'q' parameter
  * Per-query semanticRatio override
ahmed-bhs added a commit to ahmed-bhs/ai-demo that referenced this pull request Oct 30, 2025
Side-by-side comparison of FTS, Hybrid (RRF), and Semantic search.
Uses Supabase (pgvector + PostgreSQL FTS).
30 sample articles with interactive Live Component.

Related: symfony/ai#783
Author: Ahmed EBEN HASSINE <[email protected]>
@chr-hertel
Member

@ahmed-bhs could you please have a look at the pipeline failures? I think there are still some minor parts open.
