diff --git a/src/content/changelog/ai-search/2025-10-27-ai-search-reranking-system-prompt.mdx b/src/content/changelog/ai-search/2025-10-27-ai-search-reranking-system-prompt.mdx new file mode 100644 index 000000000000000..2213e07a5d6afe1 --- /dev/null +++ b/src/content/changelog/ai-search/2025-10-27-ai-search-reranking-system-prompt.mdx @@ -0,0 +1,54 @@ +--- +title: Reranking and API-based system prompt configuration in AI Search +description: Improve result accuracy with reranking and dynamically control AI Search responses by setting system prompts in API requests. +products: + - ai-search +date: 2025-10-28 +--- + +[AI Search](/ai-search/) now supports reranking for improved retrieval quality and allows you to set the system prompt directly in your API requests. + +## Rerank for more relevant results + +You can now enable [reranking](/ai-search/configuration/reranking/) to reorder retrieved documents based on their semantic relevance to the user’s query. Reranking helps improve accuracy, especially for large or noisy datasets where vector similarity alone may not produce the optimal ordering. + +You can enable and configure reranking in the dashboard or directly in your API requests: + +```javascript +const answer = await env.AI.autorag("my-autorag").aiSearch({ + query: "How do I train a llama to deliver coffee?", + model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast", + reranking: { + enabled: true, + model: "@cf/baai/bge-reranker-base" + } +}); +``` + +## Set system prompts in API + +Previously, [system prompts](/ai-search/configuration/system-prompt/) could only be configured in the dashboard. You can now define them directly in your API requests, giving you per-query control over behavior. 
For example: + ```javascript +// Dynamically set query and system prompt in AI Search +async function getAnswer(query, tone) { + const systemPrompt = `You are a ${tone} assistant.`; + + const response = await env.AI.autorag("my-autorag").aiSearch({ + query: query, + system_prompt: systemPrompt + }); + + return response; +} + +// Example usage +const query = "What is Cloudflare?"; +const tone = "friendly"; + +const answer = await getAnswer(query, tone); +console.log(answer); +``` + +Learn more about [Reranking](/ai-search/configuration/reranking/) and [System Prompt](/ai-search/configuration/system-prompt/) in AI Search. + diff --git a/src/content/docs/ai-search/configuration/index.mdx b/src/content/docs/ai-search/configuration/index.mdx index ce3e04f95130574..cb0c654887a2e48 100644 --- a/src/content/docs/ai-search/configuration/index.mdx +++ b/src/content/docs/ai-search/configuration/index.mdx @@ -22,6 +22,7 @@ The table below lists all available configuration options: | [Query rewrite system prompt](/ai-search/configuration/system-prompt/) | yes | Custom system prompt to guide query rewriting behavior | | [Match threshold](/ai-search/configuration/retrieval-configuration/) | yes | Minimum similarity score required for a vector match | | [Maximum number of results](/ai-search/configuration/retrieval-configuration/) | yes | Maximum number of vector matches returned (`top_k`) | +| [Reranking](/ai-search/configuration/reranking/) | yes | Reorder retrieved results by semantic relevance using a reranking model after initial retrieval | | [Generation model](/ai-search/configuration/models/) | yes | Model used to generate the final response | | [Generation system prompt](/ai-search/configuration/system-prompt/) | yes | Custom system prompt to guide response generation | | [Similarity caching](/ai-search/configuration/cache/) | yes | Enable or disable caching of responses for similar (not just exact) prompts | diff --git 
a/src/content/docs/ai-search/configuration/models/supported-models.mdx b/src/content/docs/ai-search/configuration/models/supported-models.mdx index b663e54a80e5dda..675d5abf5dbabba 100644 --- a/src/content/docs/ai-search/configuration/models/supported-models.mdx +++ b/src/content/docs/ai-search/configuration/models/supported-models.mdx @@ -50,6 +50,11 @@ Production models are the actively supported and recommended models that are sta | **Workers AI** | `@cf/baai/bge-m3` | 1,024 | 512 | cosine | | | `@cf/baai/bge-large-en-v1.5` | 1,024 | 512 | cosine | +### Reranking +| Provider | Alias | Input tokens | +|---|---|---| +| **Workers AI** | `@cf/baai/bge-reranker-base` | 512 | + ## Transition models There are currently no models marked for end-of-life. \ No newline at end of file diff --git a/src/content/docs/ai-search/configuration/reranking.mdx b/src/content/docs/ai-search/configuration/reranking.mdx new file mode 100644 index 000000000000000..190cdc37f66e8d8 --- /dev/null +++ b/src/content/docs/ai-search/configuration/reranking.mdx @@ -0,0 +1,74 @@ +--- +pcx_content_type: concept +title: Reranking +sidebar: + order: 4 +--- + +import { DashButton } from "~/components"; + +Reranking can help improve the quality of AI Search results by reordering retrieved documents based on semantic relevance to the user’s query. It applies a secondary model after retrieval to "rerank" the top results before they are returned. + +## How it works + +By default, reranking is **disabled** for all AI Search instances. You can enable it during creation or later from the settings page. + +When enabled, AI Search will: + +1. Retrieve a set of relevant results from your index, constrained by your `max_num_results` and `score_threshold` parameters. +2. Pass those results through a [reranking model](/ai-search/configuration/models/supported-models/). +3. Return the reranked results, which the text generation model can use for answer generation. 
+ +Reranking helps improve accuracy, especially for large or noisy datasets where vector similarity alone may not produce the optimal ordering. + +## Configuration + +You can configure reranking in several ways: + +### Configure via API + +When you make a `/search` or `/ai-search` request using the [Workers Binding](/ai-search/usage/workers-binding/) or [REST API](/ai-search/usage/rest-api/), you can: + +- Enable or disable reranking per request +- Specify the reranking model + +For example: + +```javascript +const answer = await env.AI.autorag("my-autorag").aiSearch({ + query: "How do I train a llama to deliver coffee?", + model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast", + reranking: { + enabled: true, + model: "@cf/baai/bge-reranker-base" + } +}); +``` + +### Configure in dashboard for new AI Search + +When creating a new RAG in the dashboard: + +1. Go to **AI Search** in the Cloudflare dashboard. + + + +2. Select **Create** > **Get started**. +3. In the **Retrieval configuration** step, open the **Reranking** dropdown. +4. Toggle **Reranking** on. +5. Select the reranking model. +6. Complete your setup. + +### Configure in dashboard for existing AI Search + +To update reranking for an existing instance: + +1. Go to **AI Search** in the Cloudflare dashboard. + + + +2. Select an existing AI Search instance. +3. Go to the **Settings** tab. +4. Under **Reranking**, toggle reranking on. +5. Select the reranking model. 
+ diff --git a/src/content/docs/ai-search/configuration/system-prompt.mdx b/src/content/docs/ai-search/configuration/system-prompt.mdx index 1388917e4e9048d..5979478ddb7940f 100644 --- a/src/content/docs/ai-search/configuration/system-prompt.mdx +++ b/src/content/docs/ai-search/configuration/system-prompt.mdx @@ -21,13 +21,7 @@ System prompts are particularly useful for: - Applying domain-specific tone or terminology - Encouraging consistent, high-quality output -## How to set your system prompt - -The system prompt for your AI Search can be set after it has been created by: - -1. Navigating to the [Cloudflare dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag), and go to AI > AI Search -2. Select your AI Search -3. Go to Settings page and find the System prompt setting for either Query rewrite or Generation +## System prompt configuration ### Default system prompt @@ -39,6 +33,31 @@ You can view the effective system prompt used for any AI Search's model call thr The default system prompt can change and evolve over time to improve performance and quality. ::: +### Configure via API + +When you make a `/ai-search` request using the [Workers Binding](/ai-search/usage/workers-binding/) or [REST API](/ai-search/usage/rest-api/), you can set the system prompt programmatically. + +For example: + +```javascript +const answer = await env.AI.autorag("my-autorag").aiSearch({ + query: "How do I train a llama to deliver coffee?", + model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast", + system_prompt: "You are a helpful assistant." +}); +``` + +import { DashButton } from "~/components"; + +### Configure via Dashboard +The system prompt for your AI Search can be set after it has been created: + +1. Go to **AI Search** in the Cloudflare dashboard. + +2. Select an existing AI Search instance. +3. Go to the **Settings** tab. +4. Go to **Query rewrite** or **Generation**, and edit the **System prompt**. 
+ ## Query rewriting system prompt If query rewriting is enabled, you can provide a custom system prompt to control how the model rewrites user queries. In this step, the model receives: diff --git a/src/content/docs/ai-search/usage/rest-api.mdx b/src/content/docs/ai-search/usage/rest-api.mdx index 96cdaf3c6083b45..4b672535d9c2ee8 100644 --- a/src/content/docs/ai-search/usage/rest-api.mdx +++ b/src/content/docs/ai-search/usage/rest-api.mdx @@ -52,7 +52,11 @@ curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/rags/{ "rewrite_query": false, "max_num_results": 10, "ranking_options": { "score_threshold": 0.3 + }, + "reranking": { + "enabled": true, + "model": "@cf/baai/bge-reranker-base" }, "stream": true, }' @@ -89,9 +93,13 @@ curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/rags/{ "rewrite_query": true, "max_num_results": 10, "ranking_options": { "score_threshold": 0.3 }, -}' + "reranking": { + "enabled": true, + "model": "@cf/baai/bge-reranker-base" + } +}' ``` diff --git a/src/content/docs/ai-search/usage/workers-binding.mdx b/src/content/docs/ai-search/usage/workers-binding.mdx index 0e585eaf9a3f747..db1efa2c6e4b137 100644 --- a/src/content/docs/ai-search/usage/workers-binding.mdx +++ b/src/content/docs/ai-search/usage/workers-binding.mdx @@ -46,8 +46,12 @@ const answer = await env.AI.autorag("my-autorag").aiSearch({ rewrite_query: true, max_num_results: 2, ranking_options: { - score_threshold: 0.3, + score_threshold: 0.3 }, + reranking: { + enabled: true, + model: "@cf/baai/bge-reranker-base" + }, stream: true, }); ``` @@ -115,8 +119,12 @@ const answer = await env.AI.autorag("my-autorag").search({ rewrite_query: true, max_num_results: 2, ranking_options: { - score_threshold: 0.3, + score_threshold: 0.3 }, + reranking: { + enabled: true, + model: "@cf/baai/bge-reranker-base" + } }); ``` diff --git a/src/content/docs/r2/api/tokens.mdx 
b/src/content/docs/r2/api/tokens.mdx index 902943b3ce3f31e..c6afa653fa57589 100644 --- a/src/content/docs/r2/api/tokens.mdx +++ b/src/content/docs/r2/api/tokens.mdx @@ -20,7 +20,7 @@ To create an API token: 1. In the Cloudflare dashboard, go to the **R2 object storage** page. -2. Select **Manage API tokens**. +2. Select **Manage in API tokens**. 3. Choose to create either: - **Create Account API token** - These tokens are tied to the Cloudflare account itself and can be used by any authorized system or user. Only users with the Super Administrator role can view or create them. These tokens remain valid until manually revoked. - **Create User API token** - These tokens are tied to your individual Cloudflare user. They inherit your personal permissions and become inactive if your user is removed from the account. diff --git a/src/content/partials/ai-search/ai-search-api-params.mdx b/src/content/partials/ai-search/ai-search-api-params.mdx index 859aa0a182febe9..689fa222b2350ec 100644 --- a/src/content/partials/ai-search/ai-search-api-params.mdx +++ b/src/content/partials/ai-search/ai-search-api-params.mdx @@ -12,6 +12,10 @@ The input query. The text-generation model that is used to generate the response for the query. For a list of valid options, check the AI Search Generation model Settings. Defaults to the generation model selected in the AI Search Settings. +`system_prompt` + +The system prompt for generating the answer. + `rewrite_query` Rewrites the original query into a search optimized query to improve retrieval accuracy. Defaults to `false`. @@ -27,6 +31,16 @@ Configurations for customizing result ranking. Defaults to `{}`. - `score_threshold` - The minimum match score required for a result to be considered a match. Defaults to `0`. Must be between `0` and `1`. +`reranking` + +Configurations for customizing reranking. Defaults to `{}`. 
+ +- `enabled` + - Enables or disables reranking, which reorders retrieved results based on semantic relevance using a reranking model. Defaults to `false`. + +- `model` + - The reranking model to use when reranking is enabled. + `stream` Returns a stream of results as they are available. Defaults to `false`. diff --git a/src/content/partials/ai-search/search-api-params.mdx b/src/content/partials/ai-search/search-api-params.mdx index 17c3537c6a207d6..daf2a0777dc1222 100644 --- a/src/content/partials/ai-search/search-api-params.mdx +++ b/src/content/partials/ai-search/search-api-params.mdx @@ -23,6 +23,16 @@ Configurations for customizing result ranking. Defaults to `{}`. - `score_threshold` - The minimum match score required for a result to be considered a match. Defaults to `0`. Must be between `0` and `1`. +`reranking` + +Configurations for customizing reranking. Defaults to `{}`. + +- `enabled` + - Enables or disables reranking, which reorders retrieved results based on semantic relevance using a reranking model. Defaults to `false`. + +- `model` + - The reranking model to use when reranking is enabled. + `filters` Narrow down search results based on metadata, like folder and date, so only relevant content is retrieved. For more details, refer to [Metadata filtering](/ai-search/configuration/metadata).
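
The `system_prompt` and `reranking` parameters documented in this change can be assembled in one place before calling the binding or the REST API. The following is a minimal sketch, not part of the diff: `buildAiSearchParams` and its argument names are hypothetical, and only the field names (`query`, `model`, `system_prompt`, `reranking.enabled`, `reranking.model`) come from the documentation above.

```javascript
// Hypothetical helper (an assumption, not an AI Search API): builds the
// parameter object using only the fields documented in this diff.
function buildAiSearchParams({ query, model, systemPrompt, rerankingModel } = {}) {
  if (typeof query !== "string" || query.length === 0) {
    throw new TypeError("query must be a non-empty string");
  }

  const params = { query };
  if (model) params.model = model;
  if (systemPrompt) params.system_prompt = systemPrompt;

  // Reranking defaults to disabled, so the block is only sent when a
  // reranking model has been chosen.
  if (rerankingModel) {
    params.reranking = { enabled: true, model: rerankingModel };
  }
  return params;
}

const params = buildAiSearchParams({
  query: "What is Cloudflare?",
  systemPrompt: "You are a friendly assistant.",
  rerankingModel: "@cf/baai/bge-reranker-base",
});

// JSON.stringify never emits trailing commas or unbalanced braces, which
// hand-written curl payloads are prone to.
const body = JSON.stringify(params);
console.log(body);
```

In a Worker, the same object could be passed directly, for example `env.AI.autorag("my-autorag").aiSearch(params)`, while `body` is suitable as the `-d` payload of the REST request.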