[docs-Infra] Update indexName to 'material-ui-v5' for v5 Search #47049

dav-is · 2025-10-08T02:10:28Z

Uses a separate index based on https://v5.mui.com/. We remove the master filter because the version is set to v5 everywhere except Toolpad, which has master as its version. There are only two versions, so there's little reason to filter.
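As a rough sketch of the change described above (not the actual diff): assuming the search component takes a DocSearch-style options object, switching to the dedicated index and dropping the version facet might look like this. The `SearchConfig` type, both helper names, and the example facet strings are illustrative assumptions; only the `material-ui-v5` index name comes from this PR.

```typescript
// Hypothetical shape of the options handed to the docs search component.
interface SearchConfig {
  indexName: string;
  searchParameters: { facetFilters: string[] };
}

// Drop any `version:*` facet: the dedicated index only holds v5 content,
// so filtering on version no longer narrows anything.
function withoutVersionFilter(facetFilters: string[]): string[] {
  return facetFilters.filter((f) => f.indexOf('version:') !== 0);
}

// Build the v5 config from whatever filters the search code used before.
function v5SearchConfig(existingFilters: string[]): SearchConfig {
  return {
    indexName: 'material-ui-v5',
    searchParameters: { facetFilters: withoutVersionFilter(existingFilters) },
  };
}

// Example input; these facet strings are assumed for illustration.
const exampleConfig = v5SearchConfig([
  'product:material-ui',
  'version:master',
  'language:en',
]);
```

Non-version facets are passed through untouched, so only the version filter changes.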

Adds a new crawler that crawls once a month.

Fix: #45771

I have followed (at least) the PR section of the contributing guide.


mui-bot · 2025-10-08T02:14:54Z

Netlify deploy preview

https://deploy-preview-47049--material-ui.netlify.app/

Bundle size report

No bundle size changes (Toolpad)
No bundle size changes

Generated by 🚫 dangerJS against 0eb6aff

Janpot · 2025-10-08T08:46:11Z

So the alternative would be to keep a single index, but versioned? Just create a crawler per version that indexes into the same index.

dav-is · 2025-10-08T13:10:47Z

@Janpot Why put old data into the main index? The index has page content in it. We can have up to 20 indexes in Algolia, so why create a large monolithic index when it can be partitioned according to usage?

Janpot · 2025-10-08T15:29:43Z

> Why put old data into the main index?

I haven't thought too deeply about it, it was just intuition. But thinking a bit about it:

- Less forking of the frontend search code across major versions; the search code can just filter based on an env var that probably already exists.
- We're currently only using around 0.1% of the available space, so I wouldn't necessarily call it monolithic.
- It would allow for decoupled release cycles of the sub-products, i.e. separate indexes strongly couple our index creation to the idea that we do synchronised releases across the products.
- It allows for searching across versions should we want that at some point (I don't have an immediate use case).
- Maybe some day we want to index every minor version? With LLMs it may become important to have the ability to do finer-grained search per version. In that case we're soon going to need more than 20 indices.
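To make the single-index alternative concrete, here is a minimal sketch of filtering by version in the frontend. The env var name `DOCS_VERSION`, the `master` fallback, and the helper name are assumptions for illustration, not existing code.

```typescript
// Hypothetical helper: every docs deployment queries the same shared index
// and narrows results to its own version with a facet filter.
function versionedFacetFilters(
  env: Record<string, string | undefined>,
  product: string,
): string[] {
  // `DOCS_VERSION` is an assumed variable name; fall back to `master`.
  const version = env.DOCS_VERSION || 'master';
  return [`product:${product}`, `version:${version}`];
}

// A v5 deployment would end up with a `version:v5` facet.
const v5Filters = versionedFacetFilters({ DOCS_VERSION: 'v5' }, 'material-ui');
```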

dav-is · 2025-10-08T16:46:22Z

@Janpot

> Less forking of the frontend search code across major versions; the search code can just filter based on an env var that probably already exists.

The index name can just as easily be an ENV variable.
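A minimal sketch of that idea, assuming a hypothetical `DOCS_SEARCH_INDEX` variable (the name is not taken from the codebase) and using the index from this PR as the fallback:

```typescript
// Resolve the Algolia index name from the environment.
// `DOCS_SEARCH_INDEX` is an assumed variable name used only for illustration.
function resolveIndexName(env: Record<string, string | undefined>): string {
  return env.DOCS_SEARCH_INDEX || 'material-ui-v5';
}

// Each branch or deployment can point at its own index without code changes.
const defaultIndex = resolveIndexName({});
const overriddenIndex = resolveIndexName({ DOCS_SEARCH_INDEX: 'material-ui' });
```

In a Node build script this would typically be called as `resolveIndexName(process.env)`.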

> We're currently only using around 0.1% of the available space, so I wouldn't necessarily call it monolithic.

If we extend the idea of multiple major versions to new packages, such as Base UI, a 40 MB index (the size of today's index) could be considered quite significant and close to the limit of what fits into a serverless function.

> It would allow for decoupled release cycles of the sub-products, i.e. separate indexes strongly couple our index creation to the idea that we do synchronised releases across the products.

I think of previous major versions as an archive. They exist on a dedicated branch and receive mostly backports for serious fixes. There are also some logistical hurdles to maintaining a separate branch when packages have decoupled releases (for example, how "v7" of MUI X is frozen to 7.0.2 of Material UI). We treat major releases as entirely separate "branches" of content; otherwise, why would we have a subdomain? If we had the docs of each major version stored in a subdirectory maintained in the master branch, then that might be a case for a single index, but I don't think that's a scalable approach either.

Major versions are the only time when deprecated features are deliberately removed, so with a new major version, we also get to prune the index. With a monolithic index, each major version significantly increases the index's size. With each major version, we are choosing to remove unhelpful context.

You would expect the latest index to receive a lot of traffic, the previous major version to receive less, and the major versions before that to receive significantly less traffic. This is why lumping them into one seems monolithic to me.

There is also precedent for splitting the index by version: v4.mui.com uses a separate index and still works correctly today, even though that branch and index probably haven't been touched in a long time.

If indexes combine versions, then they become dependent on the crawler or a database to create the index. We can no longer assume that it is produced by the site content we have checked out in git. An index created for a PR would depend on outside information (or it would work differently from production).

> Maybe some day we want to index every minor version? With LLMs it may become important to have the ability to do finer-grained search per version. In that case we're soon going to need more than 20 indices.

It would make sense for minor versions to share an index. They should have more in common with one another and will evolve together. Ideally, the content would reference older minors explicitly, e.g. "The checkbox component was added in v0.5.0". Then you could search for all features released in v0.4.0 vs v0.6.0.
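To illustrate, here is a sketch of how records in a shared index could be narrowed by minor version, assuming each record carried a hypothetical `addedIn` field. The record shape, helper names, and the "Button" entry are made up for the example; only the checkbox/v0.5.0 case comes from the sentence above.

```typescript
// Hypothetical record shape: `addedIn` is a "major.minor.patch" string.
interface DocRecord {
  title: string;
  addedIn: string;
}

// Compare two "major.minor.patch" strings numerically.
// Returns a negative number, zero, or a positive number.
function compareVersions(a: string, b: string): number {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < 3; i += 1) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) {
      return diff;
    }
  }
  return 0;
}

// Keep only the records whose feature already existed in `version`.
function availableIn(records: DocRecord[], version: string): DocRecord[] {
  return records.filter((r) => compareVersions(r.addedIn, version) <= 0);
}

const sampleRecords: DocRecord[] = [
  { title: 'Button', addedIn: '0.4.0' }, // illustrative entry
  { title: 'Checkbox', addedIn: '0.5.0' },
];

const inV040 = availableIn(sampleRecords, '0.4.0'); // Button only
const inV060 = availableIn(sampleRecords, '0.6.0'); // Button and Checkbox
```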

> With LLMs it may become important to have the ability to do finer-grained search per version.

I think with LLMs it is important to filter out information that might be misleading. My feeling would be that even glancing at content from a previous major could confuse an LLM. A major version is meant to be cohesive, and previous majors won't be considering future capabilities or improvements. For example, maybe a page on the last major suggests using a deprecated function that has since been removed. This recommendation would be deliberately removed in the latest version; however, if you're running the older version, the recommendation remains valid, and removing it would also be incorrect.

> It allows for searching across versions should we want that at some point (I don't have an immediate use case).

If we needed this, I think it would be on a separate page or context from the global docs search. We could create a heavier aggregate index for this case specifically. We could also optimize this index for that particular case: maybe we would exclude the content itself, or maybe we would add more metadata.
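As a sketch of what such an aggregate index could hold, assuming hypothetical record shapes (none of these field names come from the existing index): the page body is dropped and a version field is added as extra metadata.

```typescript
// Hypothetical per-version record, as stored in a regular docs index.
interface PageRecord {
  url: string;
  title: string;
  content: string;
}

// Hypothetical lightweight record for a cross-version aggregate index.
interface AggregateRecord {
  url: string;
  title: string;
  majorVersion: string;
}

// Exclude the page content and tag the record with its major version.
function toAggregateRecord(record: PageRecord, majorVersion: string): AggregateRecord {
  return { url: record.url, title: record.title, majorVersion };
}

const aggregateExample = toAggregateRecord(
  { url: 'https://v5.mui.com/', title: 'Home', content: 'Full page text' },
  'v5',
);
```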

[docs-infra] Update Algolia index name to 'material-ui-v5' and adjust search parameters (0eb6aff)

dav-is added the labels "type: bug" and "scope: docs-infra" on Oct 8, 2025

dav-is mentioned this pull request Oct 8, 2025

[docs-infra] Algolia search results targeting FAQ page results in 404 with v6 (and below) #45771

Open

dav-is requested a review from Janpot October 8, 2025 02:41
