[compiler toolkit] Prepare deepseek to accept graph passes (#1982)
Made some updates to improve UX when running experiments in the compiler
toolkit:
- Always register block mask as a pytree node. A model could use flex_attn
even if its flavor name doesn't contain `flex_attn`
- Prepare deepseek v3 to accept graph passes like llama3
- Annotate flex attention in deepseek v3
- Regional inductor doesn't work on deepseek v3 with flex attention; it
fails with error P2021796847
To repro the regional inductor issue in deepseek v3, uncomment
`regional_inductor()` and run:
```
NGPU=4 CONFIG_FILE=./torchtitan/models/deepseek_v3/train_configs/debug_model.toml ./run_train.sh --model.name compiler_toolkit.deepseek_v3 --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=2 --parallelism.expert_parallel_degree=2 --activation_checkpoint.mode none --model.flavor=debugmodel_flex_attn
```