Skip to content

Conversation

nipeone
Copy link

@nipeone nipeone commented Mar 29, 2025

Pull Request

Related issue

Fixes #625

What does this PR do?

  • ...

PR checklist

Please check if your PR fulfills the following requirements:

  • Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!

Summary by CodeRabbit

  • New Features

    • Added support for configuring and managing AI-powered embedders for vector-based search, including settings to select embedder type, model, and vector dimensions.
    • Introduced semantic and hybrid search capabilities, allowing queries to combine keyword and vector-based results.
    • Enabled retrieval of document vectors and support for custom query vectors in search requests.
    • Added API to find documents similar to a given document using vector similarity.
  • Bug Fixes

    • Improved handling of boolean values in query string generation.
  • Tests

    • Added comprehensive tests for embedders settings, vector-based search, and similarity search.
    • Introduced new test datasets and helper classes for vector search scenarios.
  • Documentation

    • Added new code samples demonstrating similarity search with embedders.

@Strift Strift requested a review from Copilot April 4, 2025 04:34
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 17 out of 19 changed files in this pull request and generated no comments.

Files not reviewed (2)
  • tests/Meilisearch.Tests/Datasets/movies_with_vector.json: Language not supported
  • tests/Meilisearch.Tests/Meilisearch.Tests.csproj: Language not supported

Copy link

coderabbitai bot commented May 14, 2025

Walkthrough

This update introduces comprehensive support for AI-powered search and embedders in the Meilisearch client. It adds new classes and methods for configuring embedders, enables hybrid and vector search, supports the retrieval of similar documents, and enhances test coverage with new datasets and test cases for these features.

Changes

File(s) Change Summary
.code-samples.meilisearch.yaml Added a new code sample demonstrating GetSimilarDocumentsAsync usage.
src/Meilisearch/Converters/EmbedderSourceConverter.cs Introduced a custom JSON converter for serializing/deserializing EmbedderSource enum.
src/Meilisearch/Embedder.cs Added Embedder, Distribution classes, and EmbedderSource enum for embedder configuration.
src/Meilisearch/Extensions/ObjectExtensions.cs Modified ToQueryString to handle Boolean values explicitly as "true"/"false".
src/Meilisearch/Index.Documents.cs Added GetSimilarDocumentsAsync<T> method to retrieve similar documents.
src/Meilisearch/Index.Embedders.cs Added methods to get, update, and reset embedders settings (GetEmbeddersAsync, UpdateEmbeddersAsync, ResetEmbeddersAsync).
src/Meilisearch/QueryParameters/DocumentsQuery.cs Added RetrieveVectors property to DocumentsQuery.
src/Meilisearch/QueryParameters/SimilarDocumentsQuery.cs Introduced SimilarDocumentsQuery class for similar documents endpoint.
src/Meilisearch/SearchQuery.cs Added Hybrid, Vector, RetrieveVectors, and Locales properties to SearchQuery; introduced Hybrid class.
src/Meilisearch/Settings.cs Added Embedders property to Settings class.
src/Meilisearch/SimilarDocumentsResult.cs Added SimilarDocumentsResult<T> class for similar documents responses.
tests/Meilisearch.Tests/Datasets.cs Added MoviesWithVectorJsonPath field for new dataset.
tests/Meilisearch.Tests/Datasets/movies_with_vector.json Added new test dataset with movies and associated vectors.
tests/Meilisearch.Tests/DocumentTests.cs Added GetSimilarDocumentsAsync test for similar documents endpoint.
tests/Meilisearch.Tests/IndexFixture.cs Added SetUpIndexForVectorSearch method to prepare index with embedders and vector data.
tests/Meilisearch.Tests/Meilisearch.Tests.csproj Included new dataset in test project output.
tests/Meilisearch.Tests/Movie.cs Added MovieWithVector class for movies with vector data.
tests/Meilisearch.Tests/SearchTests.cs Added vector search test using hybrid search and vector queries.
tests/Meilisearch.Tests/SettingsTests.cs Added tests for getting, updating, and resetting embedders settings.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Index
    participant MeilisearchServer

    Client->>Index: GetSimilarDocumentsAsync(query)
    Index->>MeilisearchServer: POST /indexes/:uid/similar (with query)
    MeilisearchServer-->>Index: SimilarDocumentsResult<T>
    Index-->>Client: SimilarDocumentsResult<T>
Loading
sequenceDiagram
    participant Client
    participant Index
    participant MeilisearchServer

    Client->>Index: GetEmbeddersAsync()
    Index->>MeilisearchServer: GET /indexes/:uid/settings/embedders
    MeilisearchServer-->>Index: Dictionary<string, Embedder>
    Index-->>Client: Dictionary<string, Embedder>

    Client->>Index: UpdateEmbeddersAsync(embedders)
    Index->>MeilisearchServer: PATCH /indexes/:uid/settings/embedders
    MeilisearchServer-->>Index: TaskInfo
    Index-->>Client: TaskInfo

    Client->>Index: ResetEmbeddersAsync()
    Index->>MeilisearchServer: DELETE /indexes/:uid/settings/embedders
    MeilisearchServer-->>Index: TaskInfo
    Index-->>Client: TaskInfo
Loading
sequenceDiagram
    participant Client
    participant Index
    participant MeilisearchServer

    Client->>Index: SearchAsync(query with hybrid/vector/retrieveVectors)
    Index->>MeilisearchServer: POST /indexes/:uid/search (with query)
    MeilisearchServer-->>Index: SearchResponse (with semantic/vector results)
    Index-->>Client: SearchResponse
Loading

Assessment against linked issues

Objective Addressed Explanation
Embedders: Support get, update, reset methods and all specified subfields (source, apiKey, model, documentTemplate, dimensions, distribution, documentTemplateMaxBytes, revision, binaryQuantized) (#625)
Search: Support hybrid search parameter, vector parameter, retrieveVectors, and related response fields (#625)
Similar documents: Implement POST /indexes/:uid/similar via GetSimilarDocumentsAsync (#625)
Tests: Add test coverage for embedders, hybrid/vector search, and similar documents (#625)

Poem

In burrows deep, a search anew,
With vectors, embedders, and hybrid too!
Now rabbits find what's truly near,
With similar docs and meaning clear.
Hopping through code, we leap and cheer—
AI-powered search is finally here!
🐇✨

Tip

⚡️ Faster reviews with caching
  • CodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure Review - Disable Cache at either the organization or repository level. If you prefer to disable all data retention across your organization, simply turn off the Data Retention setting under your Organization Settings.

Enjoy the performance boost—your workflow just got faster.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@Strift Strift mentioned this pull request May 14, 2025
22 tasks
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (26)
src/Meilisearch/Settings.cs (1)

107-111: LGTM: Embedders property added correctly.

The new Embedders property is well-documented and properly typed as a Dictionary<string, Embedder> with the correct JSON property name attribute.

Consider initializing the dictionary to an empty collection to prevent potential null reference exceptions:

-public Dictionary<string, Embedder> Embedders { get; set; }
+public Dictionary<string, Embedder> Embedders { get; set; } = new Dictionary<string, Embedder>();
src/Meilisearch/Index.Documents.cs (2)

417-425: XML documentation needs improvement for parameters and return value.

The method documentation is incomplete. It should include descriptions for the type parameter, query parameter, and return value.

/// <summary>
/// Get similar documents with the allowed Query Parameters.
/// </summary>
-/// <typeparam name="T"></typeparam>
-/// <param name="query"></param>
-/// <param name="cancellationToken"></param>
-/// <returns></returns>
+/// <typeparam name="T">Type of the document to retrieve.</typeparam>
+/// <param name="query">Query parameters to customize the similarity search.</param>
+/// <param name="cancellationToken">The cancellation token for this call.</param>
+/// <returns>Returns a result containing similar documents and related metadata.</returns>

427-427: Fix formatting in try statement.

There should be a space between try and the opening brace.

-try{
+try {
tests/Meilisearch.Tests/SearchTests.cs (1)

560-577: Test implementation is good but could benefit from more comments.

The test correctly validates the vector search functionality with a specific vector query and appropriate assertions. Consider adding a comment explaining the expected behavior and why these specific values are used for testing.

[Fact]
public async Task CustomSearchWithVector()
{
+    // Test semantic search using a manual embedder with a specific vector
+    // Expected to return "Escape Room" as the closest match
    var searchQuery = new SearchQuery
    {
        Hybrid = new Hybrid
        {
            Embedder = "manual",
            SemanticRatio = 1.0f
        },
        Vector = new[] { 0.1f, 0.6f, 0.8f },
    };

    var movies = await _indexForVectorSearch.SearchAsync<MovieWithVector>(string.Empty, searchQuery);

    Assert.Equal("522681", movies.Hits.First().Id);
    Assert.Equal("Escape Room", movies.Hits.First().Title);
}
src/Meilisearch/Converters/EmbedderSourceConverter.cs (5)

7-14: Missing XML documentation details

The XML documentation for the EmbedderSourceConverter class is empty. Consider adding a meaningful description of the class's purpose to improve code documentation.

-    /// <summary>
-    /// 
-    /// </summary>
+    /// <summary>
+    /// Converts between JSON string values and EmbedderSource enum values during serialization and deserialization.
+    /// </summary>

12-19: Missing XML documentation details for Read method

The XML documentation for the Read method is empty. Consider adding appropriate descriptions for the method and its parameters.

-    /// <summary>
-    /// 
-    /// </summary>
-    /// <param name="reader"></param>
-    /// <param name="typeToConvert"></param>
-    /// <param name="options"></param>
-    /// <returns></returns>
+    /// <summary>
+    /// Reads and converts the JSON to an EmbedderSource enum value.
+    /// </summary>
+    /// <param name="reader">The reader to get the JSON value from.</param>
+    /// <param name="typeToConvert">The type to convert.</param>
+    /// <param name="options">The serializer options to use.</param>
+    /// <returns>The converted EmbedderSource value.</returns>

39-45: Missing XML documentation details for Write method

The XML documentation for the Write method is empty. Consider adding appropriate descriptions for the method and its parameters.

-    /// <summary>
-    /// 
-    /// </summary>
-    /// <param name="writer"></param>
-    /// <param name="value"></param>
-    /// <param name="options"></param>
+    /// <summary>
+    /// Writes an EmbedderSource enum value as a JSON string.
+    /// </summary>
+    /// <param name="writer">The writer to write to.</param>
+    /// <param name="value">The value to convert to JSON.</param>
+    /// <param name="options">The serializer options to use.</param>

21-36: Consider using a dictionary lookup for improved readability

The switch statement for the string-to-enum conversion could be replaced with a dictionary lookup for improved readability and maintainability.

-            var value = reader.GetString();
-            switch (value)
-            {
-                case "openAi":
-                    return EmbedderSource.OpenAi;
-                case "huggingFace":
-                    return EmbedderSource.HuggingFace;
-                case "ollama":
-                    return EmbedderSource.Ollama;
-                case "rest":
-                    return EmbedderSource.Rest;
-                case "userProvided":
-                    return EmbedderSource.UserProvided;
-                default:
-                    return EmbedderSource.Empty;
-            }
+            var value = reader.GetString();
+            var mapping = new Dictionary<string, EmbedderSource>
+            {
+                ["openAi"] = EmbedderSource.OpenAi,
+                ["huggingFace"] = EmbedderSource.HuggingFace,
+                ["ollama"] = EmbedderSource.Ollama,
+                ["rest"] = EmbedderSource.Rest,
+                ["userProvided"] = EmbedderSource.UserProvided
+            };
+            
+            return mapping.TryGetValue(value, out var source) ? source : EmbedderSource.Empty;

47-69: Consider using a dictionary lookup for improved readability

Similar to the Read method, the switch statement in the Write method could be replaced with a dictionary lookup for improved readability and maintainability.

-            string stringValue;
-            switch (value)
-            {
-                case EmbedderSource.OpenAi:
-                    stringValue = "openAi";
-                    break;
-                case EmbedderSource.HuggingFace:
-                    stringValue = "huggingFace";
-                    break;
-                case EmbedderSource.Ollama:
-                    stringValue = "ollama";
-                    break;
-                case EmbedderSource.Rest:
-                    stringValue = "rest";
-                    break;
-                case EmbedderSource.UserProvided:    
-                    stringValue = "userProvided";
-                    break;
-                default:    
-                    stringValue = string.Empty;
-                    break;
-            }
-            writer.WriteStringValue(stringValue);
+            var mapping = new Dictionary<EmbedderSource, string>
+            {
+                [EmbedderSource.OpenAi] = "openAi",
+                [EmbedderSource.HuggingFace] = "huggingFace",
+                [EmbedderSource.Ollama] = "ollama",
+                [EmbedderSource.Rest] = "rest",
+                [EmbedderSource.UserProvided] = "userProvided"
+            };
+            
+            writer.WriteStringValue(mapping.TryGetValue(value, out var stringValue) ? stringValue : string.Empty);
src/Meilisearch/QueryParameters/SimilarDocumentsQuery.cs (2)

15-16: Consider defining Id as string or using a more specific type

The Id property is defined as object, which is very generic. For document identifiers, consider using a more specific type like string or providing documentation about what types are acceptable.

-    [JsonPropertyName("id")]
-    public object Id { get; set; }
+    [JsonPropertyName("id")]
+    public string Id { get; set; }

If multiple types need to be supported, consider clarifying this in the documentation:

     /// <summary>
-    /// Identifier of the target document
+    /// Identifier of the target document. Can be a string, integer, or other primitive type.
     /// </summary>

1-71: Include fluent builder pattern for better usability

Consider implementing a fluent builder pattern for improved API usability and readability.

For example:

// Example usage with fluent pattern
var query = new SimilarDocumentsQuery(documentId)
    .WithEmbedder("default")
    .WithLimit(20)
    .WithOffset(0)
    .WithFilter("genre = 'action'")
    .WithAttributesToRetrieve(new[] { "title", "genre" })
    .WithRankingScore();

You can implement this by adding methods like:

public SimilarDocumentsQuery WithEmbedder(string embedder)
{
    Embedder = embedder;
    return this;
}

public SimilarDocumentsQuery WithLimit(int limit)
{
    Limit = limit;
    return this;
}

// Add similar methods for other properties
src/Meilisearch/SimilarDocumentsResult.cs (6)

1-1: Redundant blank line at file start

There's an unnecessary blank line at the beginning of the file. This should be removed to maintain consistent code style.

-
 using System.Collections.Generic;

9-30: Missing XML documentation details for constructor

The XML documentation for the constructor is incomplete. Add meaningful descriptions for each parameter to improve code documentation.

     /// <summary>
-    /// 
+    /// Initializes a new instance of the SimilarDocumentsResult class with the search results and metadata.
     /// </summary>
-    /// <param name="hits"></param>
-    /// <param name="id"></param>
-    /// <param name="offset"></param>
-    /// <param name="limit"></param>
-    /// <param name="estimatedTotalHits"></param>
-    /// <param name="processingTimeMs"></param>
-    /// <param name="indexUid"></param>
+    /// <param name="hits">The collection of documents that match the similarity criteria.</param>
+    /// <param name="id">The identifier of the document used as reference for similarity.</param>
+    /// <param name="offset">The number of documents skipped in the results.</param>
+    /// <param name="limit">The maximum number of documents to return.</param>
+    /// <param name="estimatedTotalHits">The estimated total number of documents that match the similarity criteria.</param>
+    /// <param name="processingTimeMs">The time taken to process the request in milliseconds.</param>
+    /// <param name="indexUid">The unique identifier of the index that was queried.</param>

35-37: Incorrect XML documentation tag

The <inheritdoc/> tag is used but there's no base class or interface method being inherited from. Replace with a proper documentation comment.

-    /// <inheritdoc/>
+    /// <summary>
+    /// Gets the collection of documents that match the similarity criteria.
+    /// </summary>
     [JsonPropertyName("hits")]
     public IReadOnlyCollection<T> Hits { get; }

57-63: Incorrect XML documentation tags

The <inheritdoc/> tags are used for ProcessingTimeMs and IndexUid properties but there's no base class or interface method being inherited from. Replace with proper documentation comments.

-    /// <inheritdoc/>
+    /// <summary>
+    /// Gets the time taken to process the request in milliseconds.
+    /// </summary>
     [JsonPropertyName("processingTimeMs")]
     public int ProcessingTimeMs { get; }

-    /// <inheritdoc/>
+    /// <summary>
+    /// Gets the unique identifier of the index that was queried.
+    /// </summary>
     [JsonPropertyName("indexUid")]
     public string IndexUid { get; }

19-21: Format constructor parameter list for better readability

The constructor parameter list is crowded. Consider formatting it with one parameter per line for better readability.

     public SimilarDocumentsResult(
-        IReadOnlyCollection<T> hits, string id, int offset,
-        int limit, int estimatedTotalHits,int processingTimeMs, string indexUid)
+        IReadOnlyCollection<T> hits, 
+        string id, 
+        int offset,
+        int limit, 
+        int estimatedTotalHits, 
+        int processingTimeMs, 
+        string indexUid)

19-21: Missing space after comma

There's a missing space after the comma between estimatedTotalHits and processingTimeMs parameters.

-        int limit, int estimatedTotalHits,int processingTimeMs, string indexUid)
+        int limit, int estimatedTotalHits, int processingTimeMs, string indexUid)
src/Meilisearch/SearchQuery.cs (2)

177-195: Consider adding validation for SemanticRatio property

The documentation states that semanticRatio must be a number between 0.0 and 1.0, but there's no validation to enforce this constraint. Consider adding validation logic to ensure values are within the allowed range.

[JsonPropertyName("semanticRatio")]
public float? SemanticRatio 
{ 
+   get => _semanticRatio; 
+   set 
+   {
+       if (value.HasValue && (value < 0.0f || value > 1.0f))
+       {
+           throw new ArgumentOutOfRangeException(nameof(value), "SemanticRatio must be between 0.0 and 1.0");
+       }
+       _semanticRatio = value;
+   }
}
+private float? _semanticRatio;

102-105: Fix typo in documentation comment

There's a typo in the documentation comment for ShowRankingScore property.

/// <summary>
- /// Gets or sets showRankingScore parameter. It defines wheter the global ranking score of a document (between 0 and 1) is returned or not.
+ /// Gets or sets showRankingScore parameter. It defines whether the global ranking score of a document (between 0 and 1) is returned or not.
/// </summary>
src/Meilisearch/Index.Embedders.cs (2)

22-35: Add null check for embedders parameter

Consider adding a null check for the embedders parameter to prevent NullReferenceException.

public async Task<TaskInfo> UpdateEmbeddersAsync(Dictionary<string, Embedder> embedders, CancellationToken cancellationToken = default)
{
+   if (embedders == null)
+   {
+       throw new ArgumentNullException(nameof(embedders));
+   }
    var responseMessage =
        await _http.PatchAsJsonAsync($"indexes/{Uid}/settings/embedders", embedders, Constants.JsonSerializerOptionsRemoveNulls, cancellationToken: cancellationToken)
            .ConfigureAwait(false);
    return await responseMessage.Content.ReadFromJsonAsync<TaskInfo>(cancellationToken: cancellationToken)
        .ConfigureAwait(false);
}

37-48: Consider enhancing error handling for the ResetEmbeddersAsync method

The method should handle potential HTTP errors more explicitly.

public async Task<TaskInfo> ResetEmbeddersAsync(CancellationToken cancellationToken = default)
{
    var response = await _http.DeleteAsync($"indexes/{Uid}/settings/embedders", cancellationToken)
        .ConfigureAwait(false);
+   
+   response.EnsureSuccessStatusCode();

    return await response.Content.ReadFromJsonAsync<TaskInfo>(cancellationToken: cancellationToken).ConfigureAwait(false);
}
src/Meilisearch/Embedder.cs (5)

69-82: Remove or implement commented-out properties

The Request and Response properties are commented out but still included in the code. Either implement these properties with proper documentation or remove them completely to avoid confusion.

-///// <summary>
-///// request must be a JSON object with the same structure
-///// and data of the request you must send to your rest embedder.
-///// </summary>
-//[JsonPropertyName("request")]
-//public object Request { get; set; }

-///// <summary>
-///// response must be a JSON object with the same structure
-///// and data of the response you expect to receive from your rest embedder.
-///// </summary>
-//[JsonPropertyName("response")]
-//public object Response { get; set; }

113-116: Add missing XML documentation for EmbedderSource enum

The XML documentation for the EmbedderSource enum is empty.

/// <summary>
-/// 
+/// Defines the source type of an embedder
/// </summary>
[JsonConverter(typeof(EmbedderSourceConverter))]
public enum EmbedderSource

128-130: Fix typo in HuggingFace documentation comment

There's a typo in the documentation for the HuggingFace enum value.

/// <summary>
-/// guggingFace source
+/// huggingFace source
/// </summary>
HuggingFace,

83-87: Add more detailed documentation for the BinaryQuantized property

Consider adding more information about the trade-offs of binary quantization to help developers make informed decisions.

/// <summary>
-/// When set to true, compresses vectors by representing each dimension with 1-bit values. 
+/// When set to true, compresses vectors by representing each dimension with 1-bit values.
+/// This reduces storage requirements but may impact search quality.
/// </summary>
[JsonPropertyName("binaryQuantized")]
public bool? BinaryQuantized { get; set; }

95-111: Consider adding validation for Distribution class properties

The Mean and Sigma properties should be between 0 and 1 according to the documentation, but there's no validation to enforce these constraints.

public class Distribution
{
+   private float? _mean;
+   private float? _sigma;
    
    /// <summary>
    ///  a number between 0 and 1 indicating the semantic score of "somewhat relevant"
    ///  hits before using the distribution setting.
    /// </summary>
    [JsonPropertyName("mean")]
-   public float? Mean { get; set; }
+   public float? Mean 
+   { 
+       get => _mean; 
+       set 
+       {
+           if (value.HasValue && (value < 0.0f || value > 1.0f))
+           {
+               throw new ArgumentOutOfRangeException(nameof(value), "Mean must be between 0.0 and 1.0");
+           }
+           _mean = value;
+       }
+   }

    /// <summary>
    /// a number between 0 and 1 indicating the average absolute difference in
    /// _rankingScores between "very relevant" hits and "somewhat relevant" hits,
    /// and "somewhat relevant" hits and "irrelevant hits".
    /// </summary>
    [JsonPropertyName("sigma")]
-   public float? Sigma { get; set; }
+   public float? Sigma 
+   { 
+       get => _sigma; 
+       set 
+       {
+           if (value.HasValue && (value < 0.0f || value > 1.0f))
+           {
+               throw new ArgumentOutOfRangeException(nameof(value), "Sigma must be between 0.0 and 1.0");
+           }
+           _sigma = value;
+       }
+   }
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8250329 and afb2110.

📒 Files selected for processing (19)
  • .code-samples.meilisearch.yaml (1 hunks)
  • src/Meilisearch/Converters/EmbedderSourceConverter.cs (1 hunks)
  • src/Meilisearch/Embedder.cs (1 hunks)
  • src/Meilisearch/Extensions/ObjectExtensions.cs (1 hunks)
  • src/Meilisearch/Index.Documents.cs (2 hunks)
  • src/Meilisearch/Index.Embedders.cs (1 hunks)
  • src/Meilisearch/QueryParameters/DocumentsQuery.cs (1 hunks)
  • src/Meilisearch/QueryParameters/SimilarDocumentsQuery.cs (1 hunks)
  • src/Meilisearch/SearchQuery.cs (1 hunks)
  • src/Meilisearch/Settings.cs (1 hunks)
  • src/Meilisearch/SimilarDocumentsResult.cs (1 hunks)
  • tests/Meilisearch.Tests/Datasets.cs (1 hunks)
  • tests/Meilisearch.Tests/Datasets/movies_with_vector.json (1 hunks)
  • tests/Meilisearch.Tests/DocumentTests.cs (2 hunks)
  • tests/Meilisearch.Tests/IndexFixture.cs (1 hunks)
  • tests/Meilisearch.Tests/Meilisearch.Tests.csproj (1 hunks)
  • tests/Meilisearch.Tests/Movie.cs (2 hunks)
  • tests/Meilisearch.Tests/SearchTests.cs (3 hunks)
  • tests/Meilisearch.Tests/SettingsTests.cs (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/Meilisearch/Settings.cs (1)
src/Meilisearch/Embedder.cs (1)
  • Embedder (11-88)
src/Meilisearch/QueryParameters/SimilarDocumentsQuery.cs (1)
src/Meilisearch/Embedder.cs (1)
  • Embedder (11-88)
🔇 Additional comments (17)
.code-samples.meilisearch.yaml (1)

840-844: LGTM: New code sample for similar document retrieval

The addition of the get_similar_documents_post_1 sample is well-implemented and demonstrates how to use the new AI-powered search feature for retrieving similar documents.

tests/Meilisearch.Tests/Datasets.cs (1)

20-20: LGTM: Added path for vector dataset

The addition of MoviesWithVectorJsonPath follows the existing pattern for dataset paths and properly supports the new vector search test scenarios.

tests/Meilisearch.Tests/Meilisearch.Tests.csproj (1)

36-36: LGTM: Added vector dataset to project

Correctly configured the new dataset file with CopyToOutputDirectory="PreserveNewest" to ensure it's available during test execution.

src/Meilisearch/QueryParameters/DocumentsQuery.cs (1)

35-39: LGTM: Added RetrieveVectors property

The new RetrieveVectors property is well-implemented with:

  • Clear XML documentation
  • Proper JSON property name annotation
  • Correct nullable boolean type to make it optional

This property enables the retrieval of vector data alongside document results, which is essential for the AI-powered search functionality.

src/Meilisearch/Extensions/ObjectExtensions.cs (1)

65-68: Boolean handling added for query string serialization.

Excellent addition to explicitly handle Boolean values in query parameters, ensuring they're consistently serialized as "true" or "false" strings rather than potentially culture-specific representations.

tests/Meilisearch.Tests/Datasets/movies_with_vector.json (1)

1-32: Well-structured test dataset for vector search.

The test dataset includes all necessary fields for testing vector search functionality, with each movie having title, release_year, id, and 3-dimensional vector embeddings in the _vectors.manual field.

tests/Meilisearch.Tests/DocumentTests.cs (1)

766-783:

✅ Verification successful

Good test implementation for similar documents retrieval.

This test correctly verifies that documents can be retrieved based on vector similarity using the new GetSimilarDocumentsAsync method with the SimilarDocumentsQuery parameters.

Verify that the test correctly reflects the expected behavior by checking if the GetSimilarDocumentsAsync implementation sorts results by similarity score:


🏁 Script executed:

#!/bin/bash
# Find the implementation of GetSimilarDocumentsAsync
rg -A 5 "GetSimilarDocumentsAsync" src/

Length of output: 621


🏁 Script executed:

#!/bin/bash
# Display the implementation of GetSimilarDocumentsAsync with context
rg -n "public async Task<SimilarDocumentsResult<T>> GetSimilarDocumentsAsync" -A 20 src/Meilisearch/Index.Documents.cs

Length of output: 1196


Approve SimilarDocuments retrieval test

The test correctly validates that GetSimilarDocumentsAsync returns documents in descending similarity order as enforced by the Meilisearch /indexes/{Uid}/similar endpoint. The client simply forwards the server-sorted results, so asserting the title sequence is sufficient.

(Optional) To guard against future changes to default server sorting, you could also assert on each hit’s Score value.

tests/Meilisearch.Tests/IndexFixture.cs (1)

112-141: Implementation looks well structured and follows the established pattern.

The SetUpIndexForVectorSearch method follows the same pattern as other setup methods in the class, with proper error handling and task tracking. Good use of JsonSerializer to include detailed error information when an operation fails.

tests/Meilisearch.Tests/SearchTests.cs (2)

15-15: LGTM!

Adding the vector search index field is appropriate and consistent with other index fields in the class.


32-32: LGTM!

Properly initializing the vector search index in the setup method.

tests/Meilisearch.Tests/Movie.cs (1)

78-91: Class implementation looks good and follows Meilisearch conventions.

The MovieWithVector class is properly designed with appropriate JsonPropertyName attributes for mapping between C# properties and JSON field names. The underscore prefix in _Vectors correctly follows Meilisearch's convention for special properties, similar to other classes in this file.

tests/Meilisearch.Tests/SettingsTests.cs (3)

67-67: LGTM: Default embedders initialization

The new default embedders property is correctly initialized as an empty dictionary.


667-716: LGTM: Well-structured tests for embedders functionality

The new test methods for embedders follow the established pattern in the test class. The tests cover getting, updating, and resetting embedders, which provides good coverage for the new functionality.


737-738: LGTM: Updated helper method

The SettingsWithDefaultedNullFields method has been correctly updated to handle the new Embedders property.

src/Meilisearch/SearchQuery.cs (1)

152-174: Well-structured new properties for AI-powered search capabilities

The new properties for AI-powered search are well-documented and properly annotated with JsonPropertyName attributes. The additions logically extend the SearchQuery functionality to support vector search and hybrid search capabilities.

src/Meilisearch/Index.Embedders.cs (1)

1-20: Properly implemented API method for retrieving embedders

The GetEmbeddersAsync method follows the established patterns in the codebase, with proper async/await usage, ConfigureAwait(false), and cancellation token support.

src/Meilisearch/Embedder.cs (1)

11-88: Well-designed Embedder class with comprehensive property documentation

The Embedder class provides a robust model for configuring AI embedders with well-documented properties. The design allows for flexible configuration of different embedder types.

Comment on lines +438 to +440
throw new MeilisearchCommunicationError(
Constants.VersionErrorHintMessage(e.Message, nameof(GetDocumentsAsync)), e);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Method reference in error message doesn't match the current method.

The error message references GetDocumentsAsync instead of GetSimilarDocumentsAsync, which could be confusing during troubleshooting.

throw new MeilisearchCommunicationError(
-    Constants.VersionErrorHintMessage(e.Message, nameof(GetDocumentsAsync)), e);
+    Constants.VersionErrorHintMessage(e.Message, nameof(GetSimilarDocumentsAsync)), e);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
throw new MeilisearchCommunicationError(
Constants.VersionErrorHintMessage(e.Message, nameof(GetDocumentsAsync)), e);
}
throw new MeilisearchCommunicationError(
Constants.VersionErrorHintMessage(e.Message, nameof(GetSimilarDocumentsAsync)), e);
}

Comment on lines +7 to +71
/// <summary>
/// A class that handles the creation of a query string for similar Documents.
/// </summary>
public class SimilarDocumentsQuery
{
/// <summary>
/// Identifier of the target document
/// </summary>
[JsonPropertyName("id")]
public object Id { get; set; }

/// <summary>
/// Embedder name to use when computing recommendations
/// </summary>
[JsonPropertyName("embedder")]
public string Embedder { get; set; }

/// <summary>
/// Attributes to display in the returned documents
/// </summary>
[JsonPropertyName("attributesToRetrieve")]
public IEnumerable<string> AttributesToRetrieve { get; set; }

/// <summary>
/// Gets or sets the offset.
/// </summary>
[JsonPropertyName("offset")]
public int? Offset { get; set; }

/// <summary>
/// Gets or sets the limit.
/// </summary>
[JsonPropertyName("limit")]
public int? Limit { get; set; }

/// <summary>
/// Filter queries by an attribute's value
/// </summary>
[JsonPropertyName("filter")]
public string Filter { get; set; }

/// <summary>
/// Display the global ranking score of a document
/// </summary>
[JsonPropertyName("showRankingScore")]
public bool? ShowRankingScore { get; set; }

/// <summary>
/// Display detailed ranking score information
/// </summary>
[JsonPropertyName("showRankingScoreDetails")]
public bool? ShowRankingScoreDetails { get; set; }

/// <summary>
/// Exclude results with low ranking scores
/// </summary>
[JsonPropertyName("rankingScoreThreshold")]
public float? RankingScoreThreshold { get; set; }

/// <summary>
/// Return document vector data
/// </summary>
[JsonPropertyName("retrieveVectors")]
public bool? RetrieveVectors { get; set; }
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Missing validation for required properties

The SimilarDocumentsQuery class doesn't validate that required properties like Id and potentially Embedder are set before use. Consider adding validation or using a constructor to ensure these properties are initialized.

 public class SimilarDocumentsQuery
 {
+    /// <summary>
+    /// Initializes a new instance of the <see cref="SimilarDocumentsQuery"/> class with the required parameters.
+    /// </summary>
+    /// <param name="id">Identifier of the target document</param>
+    public SimilarDocumentsQuery(object id)
+    {
+        Id = id ?? throw new System.ArgumentNullException(nameof(id));
+    }
+
     /// <summary>
     /// Identifier of the target document
     /// </summary>
     [JsonPropertyName("id")]
-    public object Id { get; set; }
+    public object Id { get; }

@curquiza
Copy link
Member

curquiza commented Aug 8, 2025

Hello @nipeone
If you are still here, thank you for your PR! And can you fix the test (integration-tests and format-check)? thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[v1.13] Stabilize AI-powered search
3 participants