oraNod
Contributor

@oraNod oraNod commented Sep 12, 2025

Opening this as a draft so we can discuss it at the next DaWGs.

The sphinx-llms-txt extension generates an llms.txt file and a combined llms-full.txt file. These files support the llms.txt standard that aims to make docsites more LLM-friendly.

It's kind of hacky right now but, for the purposes of a POC, you can try it out by passing LLM_DEFINES="-D html_copy_source=True" with your make invocation.
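For anyone who would rather test without the make variable, a rough conf.py equivalent might look like the sketch below. The module name `sphinx_llms_txt` is an assumption based on the package name, so check the extension's own docs before relying on it. `html_copy_source` is the standard Sphinx option that the `-D` override above sets.

```python
# conf.py fragment (sketch only, not the change made in this PR).

# Assumption: the sphinx-llms-txt package is importable as "sphinx_llms_txt".
extensions = [
    "sphinx_llms_txt",
]

# Same effect as passing "-D html_copy_source=True" on the command line:
# Sphinx copies the page sources into the HTML output directory.
html_copy_source = True
```

In a real conf.py the extension name would be appended to the existing `extensions` list rather than replacing it.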

Attaching generated files for preview:

Note that llms-full.txt is 329.79 MB when generated so I've truncated it after line 1000.

@oraNod oraNod added the DaWGs label (Good discussion item for the DaWGs) Sep 12, 2025
@oraNod oraNod requested a review from webknjaz September 12, 2025 18:50
@oraNod oraNod changed the title from "Llm txt" to "Adopting LLMs txt file for package docs latest" Sep 15, 2025
@oraNod oraNod changed the title from "Adopting LLMs txt file for package docs latest" to "Adopting LLMs txt file for latest package docs" Sep 15, 2025
@gundalow
Contributor

@jamestalton mentioned that while this will help, having it reference Markdown files is even better, as they are more easily ingested for training.

I know updating the Makefile to generate Markdown is more involved, though Sphinx can export Markdown.

James: If you can provide some extra context, that would be great.

@oraNod
Contributor Author

oraNod commented Sep 15, 2025

> @jamestalton mentioned that while this will help, having it reference Markdown files is even better, as they are more easily ingested for training.
>
> I know updating the Makefile to generate Markdown is more involved, though Sphinx can export Markdown.
>
> James: If you can provide some extra context, that would be great.

llms.txt actually is a Markdown file. I know that's not immediately obvious though: https://llmstxt.org/#format

@gundalow
Contributor

Sorry, my fault for typing too fast.
I mean the files that llms.txt links to should be Markdown, as they are more easily parsed.

For example, https://base-ui.com/react/overview/about is also published as https://base-ui.com/react/overview/about.md, and the Markdown version is the one linked in https://base-ui.com/llms.txt
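To make the idea concrete, here is a minimal sketch of generating llms.txt entries that point at Markdown twins of HTML pages. It assumes the `.md` file lives at the same path as the page, as in the base-ui example; the helper names, the `/index` fallback, and the "Example docs" values are hypothetical and are not part of sphinx-llms-txt or this PR.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the assumed Markdown twin of a page URL (same path plus .md)."""
    parts = urlsplit(page_url)
    # Hypothetical fallback: treat a bare site root as "/index".
    path = parts.path.rstrip("/") or "/index"
    if path.endswith(".html"):
        path = path[: -len(".html")]
    # Drop any query string or fragment; they do not apply to the .md file.
    return urlunsplit((parts.scheme, parts.netloc, path + ".md", "", ""))


def render_llms_txt(title, summary, section, pages):
    """Build llms.txt text: H1 title, blockquote summary, H2 section of links."""
    lines = [f"# {title}", "", f"> {summary}", "", f"## {section}", ""]
    for name, url in pages:
        lines.append(f"- [{name}]({markdown_url(url)})")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = render_llms_txt(
        "Example docs",  # hypothetical title
        "Illustrative summary line.",  # hypothetical summary
        "Docs",
        [("About", "https://base-ui.com/react/overview/about")],
    )
    print(text)
```

The sketch only rewrites links. Actually publishing the `.md` files next to the HTML output would still need a separate build step.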
