Can we move the steps from write_doc_serialized into write_doc? #13835
Ch3ri0ur started this conversation in Feature requests and ideas
During investigations to improve parallel execution (during write), I noticed that the serial part is the cause of quite a few headaches.
It makes it necessary for the main process to load and prepare the doctrees before sending the batches to be processed.
The workflow in `_write_parallel` is a bit unfortunate, because there will always be a last batch that needs to be processed while all the others wait. I tried working with queues, but I think the overhead of sending the doctrees is too great. The "actual" fix would be to also move `get_and_resolve_doctree` into the processes, which is not possible while we have serial operations. I saw that in 6.1.x there were already commits to do so, but they broke the build for some and were reverted.
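To make the cost of the serial step more concrete, here is a toy scheduling model. It is only a sketch and not Sphinx code: the function names, the per-document costs `prep` and `write`, and the assumption that the main process can keep preparing batches while workers are busy are all made up for illustration.

```python
import heapq


def makespan_serial_prep(n_docs, n_workers, batch_size, prep, write):
    """Toy model: the main process prepares every batch serially,
    then hands it to whichever worker becomes free first."""
    workers = [0.0] * n_workers  # time at which each worker is free
    heapq.heapify(workers)
    clock = 0.0  # time spent so far in the main process
    remaining = n_docs
    while remaining > 0:
        size = min(batch_size, remaining)
        remaining -= size
        clock += size * prep  # serial preparation in the main process
        free_at = heapq.heappop(workers)
        start = max(clock, free_at)  # batch cannot start before it is prepared
        heapq.heappush(workers, start + size * write)
    return max(workers)


def makespan_parallel_prep(n_docs, n_workers, batch_size, prep, write):
    """Toy model: each worker prepares and writes its own batch."""
    workers = [0.0] * n_workers
    heapq.heapify(workers)
    remaining = n_docs
    while remaining > 0:
        size = min(batch_size, remaining)
        remaining -= size
        free_at = heapq.heappop(workers)
        heapq.heappush(workers, free_at + size * (prep + write))
    return max(workers)
```

With 40 documents, 4 workers, batches of 10 and equal (hypothetical) costs for preparing and writing, the first model is bound by the serial preparation and the last batch only starts once everything else has been prepared, while the second model keeps all workers busy from the start.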
See related issues:
#11100
#11117
#11163
#11192
Introduced here:
b32841e
reverts:
a1cd19e
2a7c40d
#11192
It would be great if we could find a way to make the parallel write fully parallel.
It looks like only the html builder is using write_doc_serialized. (So no impact on other builders)
The `doctree-resolved` event is in there. So currently, later events could depend on any modifications made to the env during that event. That might not work if they are fully in separate processes with no merging logic.
For Sphinx's own parts (e.g. the indexer), we could maybe add some merging logic.
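As a rough idea of what such merging logic could look like, here is a minimal sketch. It does not use the real indexer API: `index_batch` and `merge_partials` are hypothetical names, and the "index" is just a plain mapping from term to document names. The point is only that each process returns a partial result and the main process combines them afterwards.

```python
from collections import defaultdict


def index_batch(docs):
    """Hypothetical worker-side step: build a partial index
    (term -> set of docnames) for one batch of documents."""
    partial = defaultdict(set)
    for docname, text in docs.items():
        for word in text.lower().split():
            partial[word].add(docname)
    return dict(partial)


def merge_partials(partials):
    """Hypothetical main-process step: union the partial indexes
    returned by the workers into a single index."""
    merged = defaultdict(set)
    for partial in partials:
        for term, docnames in partial.items():
            merged[term] |= docnames
    return dict(merged)
```

This only works for state where the merge is order-independent (here a set union). Modifications to the env that later events depend on during the same build would need more thought.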
@AA-Turner do you have some insights on this issue (to shorten the investigation)? Are there some already known pitfalls/blockers or other impacts?