GitHub - couchbase-examples/rag-demo-llama-index: A RAG demo using LlamaIndex that allows you to chat with your uploaded PDF documents

RAG Demo using Couchbase, Streamlit, LlamaIndex, and OpenAI

This demo app lets you chat with your own PDFs. It uses the vector search capabilities of Couchbase to augment the OpenAI results in a Retrieval-Augmented Generation (RAG) pipeline.

Two Vector Search Implementations

This demo provides two implementations showcasing different Couchbase vector search approaches:

FTS-Based Vector Search (FTS/chat_with_pdf_search.py) - Uses Full Text Search indexes
GSI-Based Vector Search (GSI/chat_with_pdf_query.py) - Uses Global Secondary Indexes

How does it work?

You can upload your PDFs with custom data & ask questions about the data in the chat box.

For each question, you will get two answers:

one using RAG (Couchbase logo)
one using pure LLM - OpenAI (🤖).

For RAG, we use LlamaIndex, Couchbase Vector Search, and OpenAI. We fetch the parts of the PDF that are relevant to the question using vector search and add them as context for the LLM. The LLM is instructed to answer based on that context from the vector store.
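
The "retrieve, then add as context" step can be sketched without any external service. This is a minimal illustration, not the app's actual code: `build_rag_prompt` and the prompt wording are hypothetical, and the retrieved chunks are passed in as plain strings, whereas the app obtains them from Couchbase Vector Search through LlamaIndex.

```python
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Combine retrieved PDF chunks and the user question into one prompt.

    Hypothetical helper for illustration only; the demo delegates this
    step to LlamaIndex.
    """
    # Join the chunks into a single context section, trimming stray whitespace.
    context = "\n\n".join(chunk.strip() for chunk in retrieved_chunks)
    return (
        "Answer the question using only the context below.\n"
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up chunks standing in for the results of a vector search.
chunks = ["The warranty lasts 24 months.", "  Returns are accepted within 30 days.  "]
prompt = build_rag_prompt("How long is the warranty?", chunks)
print(prompt)
```

The pure LLM answer corresponds to sending the question alone, without the context section.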

Setup Instructions

Install dependencies

pip install -r requirements.txt

Set the environment secrets

Copy the secrets.example.toml file in the .streamlit folder, rename the copy to secrets.toml, and replace the placeholders with the actual values for your environment.

For FTS Vector Search (FTS/chat_with_pdf_search.py):

OPENAI_API_KEY = "<open_ai_api_key>"
DB_CONN_STR = "<connection_string_for_couchbase_cluster>"
DB_USERNAME = "<username_for_couchbase_cluster>"
DB_PASSWORD = "<password_for_couchbase_cluster>"
DB_BUCKET = "<name_of_bucket_to_store_documents>"
DB_SCOPE = "<name_of_scope_to_store_documents>"
DB_COLLECTION = "<name_of_collection_to_store_documents>"
INDEX_NAME = "<name_of_fts_index_with_vector_support>"
AUTH_ENABLED = "False"
LOGIN_PASSWORD = "<password_to_access_the_streamlit_app>"
# Required for streamlit cloud as downloads are restricted to default locations
NLTK_DATA = "/tmp/nltk-corpora"
TIKTOKEN_CACHE_DIR = "/tmp/tiktoken-cache"

For GSI Vector Search (GSI/chat_with_pdf_query.py):

OPENAI_API_KEY = "<open_ai_api_key>"
DB_CONN_STR = "<connection_string_for_couchbase_cluster>"
DB_USERNAME = "<username_for_couchbase_cluster>"
DB_PASSWORD = "<password_for_couchbase_cluster>"
DB_BUCKET = "<name_of_bucket_to_store_documents>"
DB_SCOPE = "<name_of_scope_to_store_documents>"
DB_COLLECTION = "<name_of_collection_to_store_documents>"
AUTH_ENABLED = "False"
LOGIN_PASSWORD = "<password_to_access_the_streamlit_app>"
# Required for streamlit cloud as downloads are restricted to default locations
NLTK_DATA = "/tmp/nltk-corpora"
TIKTOKEN_CACHE_DIR = "/tmp/tiktoken-cache"

Note: The GSI approach does not require the INDEX_NAME parameter.

The last two parameters (NLTK_DATA and TIKTOKEN_CACHE_DIR) are required only if you are deploying on Streamlit Cloud.
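
A quick way to catch a forgotten placeholder is to check the settings before starting the app. The sketch below is an assumption-laden illustration rather than part of the repository: `missing_settings` is a hypothetical helper, a plain dict stands in for the parsed secrets.toml, and treating LOGIN_PASSWORD as required only when AUTH_ENABLED is "True" is a guess about the app's behavior.

```python
COMMON_KEYS = [
    "OPENAI_API_KEY", "DB_CONN_STR", "DB_USERNAME", "DB_PASSWORD",
    "DB_BUCKET", "DB_SCOPE", "DB_COLLECTION",
]


def missing_settings(settings: dict, approach: str) -> list[str]:
    """Return required keys that are absent or still hold a <placeholder>."""
    required = list(COMMON_KEYS)
    if approach == "FTS":
        required.append("INDEX_NAME")  # only the FTS app needs an index name
    # Assumption: the login password matters only when auth is switched on.
    if str(settings.get("AUTH_ENABLED", "False")).lower() == "true":
        required.append("LOGIN_PASSWORD")

    missing = []
    for key in required:
        value = str(settings.get(key, "")).strip()
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(key)
    return missing


# Made-up values; DB_PASSWORD is deliberately left as a placeholder.
example = {
    "OPENAI_API_KEY": "sk-example",
    "DB_CONN_STR": "couchbases://example.invalid",
    "DB_USERNAME": "demo",
    "DB_PASSWORD": "<password_for_couchbase_cluster>",
    "DB_BUCKET": "pdfs",
    "DB_SCOPE": "_default",
    "DB_COLLECTION": "_default",
    "AUTH_ENABLED": "False",
}
print(missing_settings(example, "FTS"))  # ['DB_PASSWORD', 'INDEX_NAME']
print(missing_settings(example, "GSI"))  # ['DB_PASSWORD']
```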

Approach 1: FTS-Based Vector Search

Prerequisites

Couchbase Server 7.6+ or Couchbase Capella

FTS Index Creation

The application automatically creates the FTS index when it starts up using the create_fts_index() function. The index is created with the following configuration:

Index Name: Specified by the INDEX_NAME environment variable
Vector field: embedding with 1536 dimensions
Text field: text (indexed and stored)
Similarity metric: dot_product
Vector optimization: Optimized for recall

The index definition uses dynamic type mapping based on your scope and collection names (e.g., {scope_name}.{collection_name}).
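
To make the configuration above concrete, here is a rough sketch of what the type mapping portion of such an index definition might look like when built in Python. The helper name `sketch_fts_type_mapping` is hypothetical, and the JSON field names (`dims`, `similarity`, `vector_index_optimized_for`, and so on) are assumed from the Couchbase Search index format; the index.json file in the FTS folder and the create_fts_index() function remain the authoritative versions.

```python
import json


def sketch_fts_type_mapping(scope_name: str, collection_name: str) -> dict:
    """Build an illustrative type mapping keyed by '<scope>.<collection>'."""
    # Vector field settings taken from the configuration list above.
    vector_field = {
        "name": "embedding",
        "type": "vector",
        "dims": 1536,
        "similarity": "dot_product",
        "vector_index_optimized_for": "recall",
        "index": True,
    }
    # The text field is both indexed and stored.
    text_field = {"name": "text", "type": "text", "index": True, "store": True}
    return {
        f"{scope_name}.{collection_name}": {
            "enabled": True,
            "dynamic": True,
            "properties": {
                "embedding": {"enabled": True, "fields": [vector_field]},
                "text": {"enabled": True, "fields": [text_field]},
            },
        }
    }


# "shared" and "docs" are placeholder scope and collection names.
mapping = sketch_fts_type_mapping("shared", "docs")
print(json.dumps(mapping, indent=2))
```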

If you prefer to create the index manually through the Couchbase UI, you can do so:

Couchbase Capella
- Import the index.json file from the FTS folder into Capella, following the instructions in the Couchbase Capella documentation.

Run the FTS application

streamlit run FTS/chat_with_pdf_search.py

Approach 2: GSI-Based Vector Search

Prerequisites

Couchbase Server 8.0+ or Couchbase Capella

This approach uses CouchbaseQueryVectorStore, which relies on a Global Secondary Index (GSI) for vector search. The vector search is performed through SQL++ queries using the cosine similarity distance metric.

Understanding Vector Index Types

Couchbase offers different types of vector indexes for GSI-based vector search:

Hyperscale Vector Indexes (BHIVE)

Best for pure vector searches - content discovery, recommendations, semantic search
High performance with low memory footprint - designed to scale to billions of vectors
Optimized for concurrent operations - supports simultaneous searches and inserts
Use when: You primarily perform vector-only queries without complex scalar filtering
Ideal for: Large-scale semantic search, recommendation systems, content discovery

Composite Vector Indexes

Best for filtered vector searches - combines vector search with scalar value filtering
Efficient pre-filtering - scalar attributes reduce the vector comparison scope
Use when: Your queries combine vector similarity with scalar filters that eliminate large portions of data
Ideal for: Compliance-based filtering, user-specific searches, time-bounded queries

Choosing the Right Index Type

Start with Hyperscale Vector Index for pure vector searches and large datasets
Use Composite Vector Index when scalar filters significantly reduce your search space
Consider your dataset size: Hyperscale scales to billions, Composite works well for tens of millions to billions

For more details, see the Couchbase Vector Index documentation.

Important: The vector index should be created after ingesting the documents (uploading PDFs).

Example of Creating Vector Index via SQL++:

After you upload your PDFs, the application creates the vector index by executing the SQL++ query below. The application includes a create_vector_index() function that creates the index with the following configuration:

# Example of how the vector index is created in the code
create_query_string = f"""
CREATE INDEX `idx_vector_embedding` 
ON `{collection_name}` (vector VECTOR) 
USING GSI 
WITH {{
  "dimension": 1536,
  "description": "IVF,SQ8",
  "similarity": "cosine"
}}
"""

The function:

Checks if the index already exists before creating
Creates a GSI vector index on the vector field
Configures the index with 1536 dimensions (matching OpenAI embeddings)
Uses cosine similarity for distance calculations
Applies IVF,SQ8 quantization for optimized performance
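
The check-then-create behavior listed above can be sketched as follows. This is not the repository's create_vector_index() function: `build_create_index_query` and `ensure_vector_index` are hypothetical helpers, the set of existing index names is assumed to be fetched elsewhere, and `run_query` stands in for whatever executes SQL++ against the cluster.

```python
import json


def build_create_index_query(
    collection_name: str,
    index_name: str = "idx_vector_embedding",
    dimension: int = 1536,
    description: str = "IVF,SQ8",
    similarity: str = "cosine",
) -> str:
    """Assemble the CREATE INDEX statement shown above as a single line."""
    options = {
        "dimension": dimension,
        "description": description,
        "similarity": similarity,
    }
    return (
        f"CREATE INDEX `{index_name}` "
        f"ON `{collection_name}` (vector VECTOR) "
        f"USING GSI WITH {json.dumps(options)}"
    )


def ensure_vector_index(existing_index_names, run_query, collection_name: str) -> bool:
    """Create the index only if it is not already present.

    Returns True when a CREATE INDEX statement was issued.
    """
    index_name = "idx_vector_embedding"
    if index_name in existing_index_names:
        return False
    run_query(build_create_index_query(collection_name, index_name))
    return True


# Dry run: collect the statements in a list instead of sending them to a cluster.
issued_statements = []
ensure_vector_index(set(), issued_statements.append, "docs")
print(issued_statements[0])
```

Passing a callable instead of a cluster object keeps the sketch testable offline; in the app, the query would be executed through the Couchbase SDK.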

Understanding Index Configuration Parameters:

The description parameter controls how Couchbase optimizes vector storage and search performance:

Format: 'IVF[<centroids>],{PQ|SQ}<settings>'

Centroids (IVF - Inverted File):

Controls how the dataset is subdivided for faster searches
More centroids = faster search, slower training
Fewer centroids = slower search, faster training
If omitted (like IVF,SQ8), Couchbase auto-selects based on dataset size

Quantization Options:

SQ (Scalar Quantization): SQ4, SQ6, SQ8 (4, 6, or 8 bits per dimension)
PQ (Product Quantization): PQ<subquantizers>x<bits> (e.g., PQ32x8)
Higher values = better accuracy, larger index size

Common Examples:

IVF,SQ8 - Auto centroids, 8-bit scalar quantization (good default)
IVF1000,SQ6 - 1000 centroids, 6-bit scalar quantization
IVF,PQ32x8 - Auto centroids, 32 subquantizers with 8 bits

For detailed configuration options, see the Quantization & Centroid Settings.
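
As a way to read the description format, the sketch below parses the forms listed in this README into their parts. `parse_description` is a hypothetical helper that covers only the patterns shown here (optional centroid count, SQ4/SQ6/SQ8, or PQ<subquantizers>x<bits>); Couchbase itself may accept other variants.

```python
import re

# IVF[<centroids>],SQ<bits>  or  IVF[<centroids>],PQ<subquantizers>x<bits>
_DESCRIPTION_RE = re.compile(
    r"^IVF(?P<centroids>\d*),"
    r"(?:SQ(?P<sq_bits>4|6|8)|PQ(?P<subq>\d+)x(?P<pq_bits>\d+))$"
)


def parse_description(description: str) -> dict:
    """Split a description string into centroid and quantization settings."""
    match = _DESCRIPTION_RE.match(description.strip())
    if match is None:
        raise ValueError(f"Unrecognized description: {description!r}")

    # An empty centroid count means Couchbase picks one automatically.
    centroids = int(match["centroids"]) if match["centroids"] else None
    if match["sq_bits"]:
        return {"centroids": centroids, "quantization": "SQ", "bits": int(match["sq_bits"])}
    return {
        "centroids": centroids,
        "quantization": "PQ",
        "subquantizers": int(match["subq"]),
        "bits": int(match["pq_bits"]),
    }


for text in ("IVF,SQ8", "IVF1000,SQ6", "IVF,PQ32x8"):
    print(text, "->", parse_description(text))
```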

Note: In GSI vector search, the distance represents the vector distance between the query and document embeddings. Lower distance indicates higher similarity, while higher distance indicates lower similarity. This demo uses cosine similarity for measuring document relevance.
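
The relationship between distance and similarity can be illustrated with a small calculation. The sketch assumes the common definition of cosine distance as 1 minus cosine similarity; the exact value Couchbase reports should be confirmed against its documentation. The vectors are made-up two-dimensional stand-ins for the 1536-dimensional embeddings.

```python
import math


def cosine_distance(a, b) -> float:
    """Return 1 - cosine similarity for two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]
doc_close = [0.9, 0.1]  # points in nearly the same direction as the query
doc_far = [0.0, 1.0]    # orthogonal to the query

# The closer document yields the smaller distance, so it ranks first.
print(cosine_distance(query, doc_close))
print(cosine_distance(query, doc_far))
```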

Run the GSI application

streamlit run GSI/chat_with_pdf_query.py

Note: Upload a PDF document before asking questions. The application also works without an upload if the data is already present in Capella.
