Add regional AoT compilation #3057
Conversation
@sayakpaul after re-thinking about regional compilation, I think that the current process is still a bit too complex to be included in the blog post. I think that simplifying this process at the library level (either in …

@cbensimon good point. However, I think since the post is the only go-to resource out there for devs building on ZeroGPU, it's nice to include the regional compilation section. Once we have an API in Spaces or anywhere else, we can simply swap it back. Regarding using …
Vaibhavs10 left a comment
Probably okay to go ahead and merge this as-is, and then you can refine as you abstract away the complexities a bit more.
Yeah, pretty much. Things are already in progress, so it should be just a few days once we swap things out from here. So, waiting to hear what Charles thinks.
> - [LTX Video](https://huggingface.co/spaces/zerogpu-aoti/ltx-dev-fast)
>
> ### Regional compilation
>
> - [Regional compilation recipe](https://docs.pytorch.org/tutorials/recipes/regional_compilation.html)
👏
I initially thought that it was your recent tutorial on regional AoT. Still nice to include this one, though.
It's about to be merged: pytorch/tutorials#3543
Approved. Only TODO link left @sayakpaul (link to the push and re-use collection).
Co-authored-by: Charles <[email protected]>
Will merge after updating the link.
zerogpu-aoti.md (Outdated)
> In our example, we can compile the repeated blocks of the Flux transformer ahead of time like so. The [Flux Transformer](https://github.com/huggingface/diffusers/blob/c2e5ece08bf22d249c62e964f91bc326cf9e3759/src/diffusers/models/transformers/transformer_flux.py) has two kinds of repeated blocks: `FluxTransformerBlock` and `FluxSingleTransformerBlock`.
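The "like so" code itself isn't captured in this view. Below is a minimal sketch of what regional AoT compilation of these repeated blocks could look like, assuming the `spaces` AoT helpers (`aoti_capture`, `aoti_compile`, `aoti_apply`) shown earlier in the post; the prompt, GPU duration, and loop structure are illustrative, not the post's actual code:

```python
import spaces
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

@spaces.GPU(duration=1500)  # duration is a placeholder
def compile_block(block):
    # Capture real example inputs for this block by running the pipeline once.
    with spaces.aoti_capture(block) as call:
        pipe("example prompt to trigger a forward pass")
    # Export and AoT-compile just this one block.
    exported = torch.export.export(block, args=call.args, kwargs=call.kwargs)
    return spaces.aoti_compile(exported)

# Compile the first instance of each kind of repeated block, then reuse the
# compiled graph for all of its identical siblings.
for blocks in (
    pipe.transformer.transformer_blocks,         # FluxTransformerBlock
    pipe.transformer.single_transformer_blocks,  # FluxSingleTransformerBlock
):
    compiled = compile_block(blocks[0])
    for block in blocks:
        spaces.aoti_apply(compiled, block)
```

Because identical blocks share one compiled graph, export time scales with the number of block kinds (two here) rather than the total block count, which is what makes the ahead-of-time step tractable on ZeroGPU.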
> You can check out [this Space](https://huggingface.co/spaces/zerogpu-aoti/Qwen-Image-Edit-AoT-Regional) for a complete example.
This code was clarifying to me, rather than the demo space itself. Perhaps we could link to both and use the code to illustrate the explanations.
However, I only see `pipeline.transformer.transformer_blocks[0]` being compiled, whereas we mentioned two different kinds of repeated blocks in the description.
The writing demonstrates with Flux. The demo uses Qwen, which has a single kind of repeated block. I have changed the link to the Flux one from @cbensimon. But just a link to the demo is fine, IMO.
> ### Use a compiled graph from the Hub
>
> Once a model (or even a model block) is compiled ahead of time, we can serialize the compiled graph module as an artifact and reuse it later. In the context of a ZeroGPU-powered demo on Spaces, this will significantly cut down the demo startup time.
>
> To keep the storage light, we can just save the compiled model graph without including any model parameters inside the artifact.
>
> Check out [this collection](TODO) that shows a full workflow of obtaining a compiled model graph, pushing it to the Hub, and then using it to build a demo.
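As a hypothetical sketch of the workflow that collection is meant to show (pushing a compiled graph artifact to the Hub, then loading it at startup), assuming the artifact is a `.pt2` package, for example one produced by `torch._inductor.aoti_compile_and_package`; the repo id and filenames are placeholders:

```python
import torch
from huggingface_hub import hf_hub_download, upload_file

# One-time, after ahead-of-time compilation: push the packaged graph
# (model parameters excluded) to a Hub repo.
upload_file(
    path_or_fileobj="transformer_blocks.pt2",
    path_in_repo="transformer_blocks.pt2",
    repo_id="your-username/flux-aot-artifacts",
)

# At demo startup: download the artifact and load the compiled graph,
# skipping export and compilation entirely.
package_path = hf_hub_download(
    repo_id="your-username/flux-aot-artifacts",
    filename="transformer_blocks.pt2",
)
compiled_block = torch._inductor.aoti_load_package(package_path)
```

Since the artifact stores only the compiled graph, the demo still loads the model parameters as usual and only swaps in the precompiled forward, which is why startup pays neither the export nor the compile cost.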
I don't understand this section. What are the benefits of persisting the serialization vs the code demonstrated in the previous example? Also, the collection is missing.
> Also, the collection is missing.

> I don't understand this section. What are the benefits of persisting the serialization vs the code demonstrated in the previous example?

We skip the compilation time by reusing a compiled graph.
Co-authored-by: Pedro Cuenca <[email protected]>