Blargian
Member

@Blargian Blargian commented Aug 27, 2025

Summary

Some small improvements:

  • Expand to work with .md or .mdx
  • Don't wrap terminology that appears in headers (e.g. ## What are parts should not become ## What are ^^Parts^^)
  • Don't run the processor on every file in the docs, only on the folders which we ran wrap-glossary-terms.py on
  • Fix README.md
  • Remove ^^ ^^ from parts of the docs which don't need to be wrapped. Let's begin with core-concepts and quick-starts; we can expand in future.
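
The header rule in the list above can be illustrated with a minimal sketch. The wrapping script in this PR is wrap-glossary-terms.py; the JavaScript below is not that script. The names `wrapTerms` and `escapeRegExp` are hypothetical, and the sketch assumes ATX-style headings (`#` to `######`) and does not treat fenced code blocks specially.

```javascript
// Hypothetical helper (not the code from this PR): wrap glossary terms in
// ^^ ^^ on every line except Markdown headings.
function escapeRegExp(text) {
  // Escape characters that have a special meaning inside a regular expression.
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function wrapTerms(markdown, terms) {
  // Match whole words only, and skip terms that are already wrapped in ^^ ^^.
  const pattern = new RegExp(
    `(?<!\\^\\^)\\b(${terms.map(escapeRegExp).join('|')})\\b(?!\\^\\^)`,
    'gi'
  );
  const isHeading = (line) => /^\s{0,3}#{1,6}\s/.test(line);

  return markdown
    .split('\n')
    .map((line) => (isHeading(line) ? line : line.replace(pattern, '^^$1^^')))
    .join('\n');
}

const sample = '## What are parts\n\nData is stored in parts.';
console.log(wrapTerms(sample, ['parts']));
```

Skipping heading lines entirely keeps the raw ^^ ^^ syntax out of anything generated from headings, such as the side navigation.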

Future improvements:

  • If a page contains many occurrences of a glossary term, only mark one per paragraph, i.e.
This is a made up paragraph about ^^parts^^, which contains a lot of references to ^^parts^^. I mean like really a lot of references to ^^parts^^. So many ^^parts^^. All the ^^parts^^. ClickHouse has the best ^^parts^^.

could just be

This is a made up paragraph about ^^parts^^, which contains a lot of references to parts. I mean like really a lot of references to parts. So many parts. All the parts. ClickHouse has the best parts.
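
One way to get from the first paragraph to the second is sketched below. This is an assumption about how the future improvement might work, not code from this PR: the function name `keepFirstMarkPerParagraph` is hypothetical, paragraphs are assumed to be separated by blank lines, and "one per paragraph" is interpreted as one marked occurrence of each term per paragraph.

```javascript
// Hypothetical sketch: keep the first ^^term^^ in each paragraph and unwrap
// later occurrences of the same term (compared case-insensitively).
function keepFirstMarkPerParagraph(markdown) {
  return markdown
    .split(/\n{2,}/) // assumption: blank lines separate paragraphs
    .map((paragraph) => {
      const seen = new Set();
      return paragraph.replace(/\^\^(.+?)\^\^/g, (match, term) => {
        const key = term.toLowerCase();
        if (seen.has(key)) {
          return term; // already marked in this paragraph: drop the markers
        }
        seen.add(key);
        return match; // first occurrence: keep the markers
      });
    })
    .join('\n\n'); // note: runs of blank lines are normalised to one
}

const before =
  'This is a made up paragraph about ^^parts^^, which contains a lot of ' +
  'references to ^^parts^^. So many ^^parts^^.';
console.log(keepFirstMarkPerParagraph(before));
```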

We'll need to add a preprocessing step to remove ^^ ^^ before translating, or figure out a way to make this multilingual.
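
The preprocessing option could be as small as the sketch below. It is only an illustration under assumptions: `stripGlossaryMarkers` is a hypothetical name, and where the step would hook into the translation workflow is not specified in this PR.

```javascript
// Hypothetical preprocessing step: remove the ^^ ^^ markers and keep the
// wrapped text, so the translator only sees plain prose.
function stripGlossaryMarkers(markdown) {
  return markdown.replace(/\^\^(.+?)\^\^/g, '$1');
}

console.log(stripGlossaryMarkers('ClickHouse has the best ^^parts^^.'));
```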

Checklist

…re already wrapped, only do concepts pages (and quick-starts) and only run the processor for specific paths (core concepts, quick starts for now)
@Blargian Blargian requested a review from a team as a code owner August 27, 2025 10:44

vercel bot commented Aug 27, 2025

@Blargian is attempting to deploy a commit to the ClickHouse Team on Vercel.

A member of the Team first needs to authorize it.


vercel bot commented Aug 27, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Updated (UTC)
clickhouse-docs Ready Ready Preview Aug 28, 2025 7:08pm

@Blargian Blargian changed the title Glossary: modifications to glossary script Glossary: modifications to glossary system Aug 27, 2025
@Blargian Blargian requested a review from dhtclk August 27, 2025 10:51
return filePath?.includes(allowedPath);
});

return (filePath?.endsWith('.mdx') || filePath?.endsWith('.md')) &&
Member Author

@dhtclk correct me if I'm wrong, but I don't see why this won't work on .md files?

Collaborator

At build time, the plugin looks for "^^ ^^", but it would only work properly in .mdx files because it injects our custom React components. Those components aren't parsed properly in .md files, which can't parse JSX, only standard Markdown syntax.

Comment on lines +53 to +57
const allowedPaths = [
  'docs/concepts/*',
  'docs/managing-data/core-concepts/*',
  'docs/getting-started/*'
];
Member Author

We seem to be processing all the files when only a small portion contain ^^ ^^, so let's just do a subset. We can add more paths as we go along.
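
Only fragments of the filter are visible in this diff, so the following is a sketch of how the two checks (extension and allowed path) might combine, not the merged code. The function name `shouldProcess` is hypothetical, and stripping the trailing `*` before calling `includes` is an assumption made so that plain substring matching can succeed.

```javascript
// Allow-list copied from the diff above.
const allowedPaths = [
  'docs/concepts/*',
  'docs/managing-data/core-concepts/*',
  'docs/getting-started/*',
];

// Hypothetical filter: process a file only if it is Markdown/MDX and lives
// under one of the allowed folders.
function shouldProcess(filePath) {
  if (typeof filePath !== 'string') {
    return false; // mirrors the optional chaining (filePath?.) in the diff
  }
  const normalized = filePath.replace(/\\/g, '/'); // tolerate Windows separators
  const hasAllowedExtension =
    normalized.endsWith('.md') || normalized.endsWith('.mdx');
  const inAllowedPath = allowedPaths.some((allowedPath) =>
    // Assumption: the trailing "*" is a wildcard marker, not a literal.
    normalized.includes(allowedPath.replace(/\*$/, ''))
  );
  return hasAllowedExtension && inAllowedPath;
}

console.log(shouldProcess('/repo/docs/concepts/parts.md'));
console.log(shouldProcess('/repo/docs/integrations/kafka.mdx'));
```

Because the check is a substring match, any path that merely contains one of these folder names would pass; a stricter prefix or glob match may be preferable as the list grows.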

Collaborator

@dhtclk dhtclk left a comment

we'll just have to adjust the glossary-transformer so that it only processes mdx files.

UPDATE: never mind, I was wrong :D

@Blargian
Member Author

we'll just have to adjust the glossary-transformer so that it only processes mdx files.

Oh ye of little faith :-) I pushed a commit to rename parts.mdx to parts.md. The transformation is just from the custom ^^ ^^ syntax to an HTML-like one (it's not even valid React, because we're not importing the component into the file). Docusaurus transforms both .md and .mdx to HTML at the end of the day, so from React's point of view it works.

Processed 28 glossary terms in: merges.mdx
Processed 13 glossary terms in: parts.md <--- 
Processed 19 glossary terms in: primary-indexes.mdx

https://clickhouse-docs-f4wde8985-clickhouse.vercel.app/docs/parts#part-merges
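
The kind of text-level transformation described in this comment can be sketched as follows. This is an illustration only: the element name `GlossaryTooltip`, its `term` attribute, and the function names are hypothetical, since the real component and the plugin's actual output are not shown in this conversation.

```javascript
// Escape double quotes so the term is safe inside an attribute value.
function escapeAttribute(value) {
  return value.replace(/"/g, '&quot;');
}

// Hypothetical transformer: rewrite ^^term^^ into an HTML-like element and
// count the replacements, similar in spirit to the "Processed N glossary
// terms" log lines above.
function transformGlossarySyntax(source) {
  let count = 0;
  const output = source.replace(/\^\^([^^\n]+?)\^\^/g, (_match, term) => {
    count += 1;
    return `<GlossaryTooltip term="${escapeAttribute(term)}">${term}</GlossaryTooltip>`;
  });
  return { output, count };
}

const result = transformGlossarySyntax('Data is stored in ^^parts^^.');
console.log(result.output);
console.log(`Processed ${result.count} glossary terms`);
```

Since this is a plain string rewrite, it does not depend on the file extension, which is consistent with the observation that parts.md was processed alongside the .mdx files.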

@Blargian Blargian merged commit 56cdb43 into ClickHouse:main Aug 28, 2025
10 of 13 checks passed
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tool Tips in Headers show syntax in the side nav
Convert .md files to .mdx to enable glossary system
2 participants