
Commit 5036908

Update doc

Differential Revision: D79030332
Pull Request resolved: #854

1 parent: ac33906

2 files changed: +15, -7 lines


docs/source/getting_started/parallelism.rst

Lines changed: 2 additions & 2 deletions
@@ -76,6 +76,7 @@ method.
 For loading raw byte strings into array format, SPDL offers efficient
 functions through :py:mod:`spdl.io` module.
 
+.. _pipeline-parallelism-custom-mt:
 
 Multi-threading (custom)
 ------------------------
@@ -100,8 +101,7 @@ instance, or put it in a
 The following example shows how to initialize and store a CUDA stream
 in a thread-local storage.
 
-.. admonition::
-   :class: note
+.. note::
 
    The following code is now available as :py:func:`spdl.io.transfer_tensor`.
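For reference, the thread-local CUDA stream pattern that this documentation section walks through (and that the note above says is now packaged as spdl.io.transfer_tensor) looks roughly like the sketch below. It is not the library code; the helper names _get_stream and transfer are placeholders:

import threading

import torch

# One slot per thread, so each background worker gets its own CUDA stream.
_THREAD_LOCAL = threading.local()


def _get_stream() -> torch.cuda.Stream:
    # Lazily create and cache a stream for the calling thread.
    if not hasattr(_THREAD_LOCAL, "stream"):
        _THREAD_LOCAL.stream = torch.cuda.Stream()
    return _THREAD_LOCAL.stream


def transfer(batch: torch.Tensor) -> torch.Tensor:
    stream = _get_stream()
    with torch.cuda.stream(stream):
        # Pin the host memory, then copy asynchronously on the side stream.
        batch = batch.pin_memory().to("cuda", non_blocking=True)
    # Wait for the copy so the tensor is safe to use on the default stream.
    stream.synchronize()
    return batch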

src/spdl/io/_transfer.py

Lines changed: 13 additions & 5 deletions
@@ -116,13 +116,21 @@ def _get_trancfer_func() -> _DataTransfer:
 
 
 def transfer_tensor(batch: T, /) -> T:
-    """Transfers PyTorch CPU Tensors to CUDA in a background.
+    """Transfers PyTorch CPU Tensors to CUDA in a dedicated stream.
 
     This function wraps calls to :py:meth:`torch.Tensor.pin_memory` and
-    :py:meth:`torch.Tensor.to`, and execute them in a dedicated CUDA stream,
-    so that when called in a background thread, data transfer is carried out
-    in a way it overlaps with the GPU computation happening in the foreground
-    thread (such as training and inference).
+    :py:meth:`torch.Tensor.to`, and execute them in a dedicated CUDA stream.
+
+    When called in a background thread, the data transfer overlaps with
+    the GPU computation happening in the foreground thread (such as training
+    and inference).
+
+    .. seealso::
+
+       :ref:`pipeline-parallelism-custom-mt` - An intended way to use
+       this function in :py:class:`~spdl.pipeline.Pipeline`.
+
+    .. image:: ../../_static/data/parallelism_transfer.png
 
     Concretely, it performs the following operations.
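As a rough illustration of the overlap the updated docstring describes, a single background thread can run spdl.io.transfer_tensor on the next batch while the main thread runs the GPU work on the current one. This sketch is not part of the commit; run, train_step, and the loader iterable are placeholders, and the seealso above points to the Pipeline-based setup as the intended usage:

from concurrent.futures import ThreadPoolExecutor

import spdl.io
import torch


def train_step(batch: torch.Tensor) -> None:
    # Placeholder for the foreground GPU work (training / inference).
    batch.sum()


def run(loader) -> None:
    # A single background thread calls spdl.io.transfer_tensor, so the
    # pinning and host-to-device copy run on a dedicated CUDA stream and
    # overlap with the GPU work issued by the main thread.
    # (Assumes the loader yields at least one CPU batch.)
    with ThreadPoolExecutor(max_workers=1) as executor:
        it = iter(loader)
        future = executor.submit(spdl.io.transfer_tensor, next(it))
        for cpu_batch in it:
            gpu_batch = future.result()
            # Kick off the next transfer before running the current step.
            future = executor.submit(spdl.io.transfer_tensor, cpu_batch)
            train_step(gpu_batch)
        train_step(future.result())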
