Commit 681c66f

remove TORCH_NCCL_AVOID_RECORD_STREAMS env var setting (#1088)
The setting is removed because PyTorch now emits the following warning:
```
Warning: TORCH_NCCL_AVOID_RECORD_STREAMS is the default now, this environment variable is thus deprecated. (function operator())
```
The warning comes from https://github.com/pytorch/pytorch/blob/main/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp#L997
1 parent 5707c3d commit 681c66f

1 file changed: +0 -5 lines changed

torchtitan/distributed/utils.py

Lines changed: 0 additions & 5 deletions
@@ -207,11 +207,6 @@ def _get_distributed_backend(job_config):
         os.makedirs(dump_dir, exist_ok=True)
         _warn_overwrite_env(TRACE_FILE, f"{dump_dir}/rank_")
 
-    # to mitigate the memory issue that collectives using
-    # async_op=True hold memory longer than they should
-    # such as those in tensor parallelism
-    os.environ["TORCH_NCCL_AVOID_RECORD_STREAMS"] = "1"
-
     torch.distributed.init_process_group(
         backend=_get_distributed_backend(job_config),
         timeout=timedelta(seconds=job_config.comm.init_timeout_seconds),
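
This change stops torchtitan from setting the variable, but a launch script or shell profile may still export it, in which case the deprecation warning would presumably still appear. A minimal sketch of clearing it before `torch.distributed.init_process_group` is called follows. The helper name `drop_deprecated_env` and the `DEPRECATED_ENV_VARS` tuple are hypothetical and are not part of torchtitan or PyTorch.

```python
import os

# Hypothetical list for illustration; only the variable named in this commit is included.
DEPRECATED_ENV_VARS = ("TORCH_NCCL_AVOID_RECORD_STREAMS",)


def drop_deprecated_env(environ=None, names=DEPRECATED_ENV_VARS):
    """Delete deprecated variables from `environ` and return the names removed.

    `environ` defaults to the process environment; passing a plain dict
    makes the helper easy to exercise without touching os.environ.
    """
    if environ is None:
        environ = os.environ
    removed = []
    for name in names:
        if name in environ:
            del environ[name]
            removed.append(name)
    return removed
```

Calling `drop_deprecated_env()` once at startup, before the process group is initialized, would be one place to use it. The returned list can be logged so that the stale export is noticed and removed at its source.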
