77 changes: 77 additions & 0 deletions .github/workflows/linux_cuda_aarch64_wheel.yaml
@@ -56,3 +56,80 @@ jobs:
architecture: aarch64
build-platform: "python-build-package"
build-command: "BUILD_AGAINST_ALL_FFMPEG_FROM_S3=1 ENABLE_CUDA=1 python -m build --wheel -vvv --no-isolation"

install-and-test:
runs-on: linux.arm64.2xlarge
container:
image: pytorch/manylinuxaarch64-builder:cuda12.6
env:
cuda_version_without_periods: "126"
Contributor

This CUDA version is specified outside of the matrix, unlike in linux_cuda_wheel, where it is part of the matrix. Is that intentional?

matrix:
# 3.10 corresponds to the minimum python version for which we build
# the wheel unless the label cliflow/binaries/all is present in the
# PR.
# For the actual release we should add that label and change this to
# include more python versions.
python-version: ['3.10']
# We test against 12.6 and 13.0 to avoid having too big of a CI matrix,
# but for releases we should add 12.8.
cuda-version: ['12.6', '13.0']

Contributor Author

Good catch - yes, that's intentional. In this new job we only test on a CUDA 12.6 machine, not on any other CUDA version (see `image: pytorch/manylinuxaarch64-builder:cuda12.6` above).

We don't need to test more CUDA versions because we can't run CUDA tests on ARM anyway (that's a test-infra limitation, from what I understand). And we still have to use a CUDA docker image because the wheels were built with CUDA support.
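
To illustrate why a single hard-coded value is enough here: the job only needs the period-free CUDA string to locate the build artifact and the PyTorch nightly index. The sketch below mirrors the two templates used in this job's steps; the helper names `artifact_name` and `nightly_index_url` are hypothetical and exist only for this illustration.

```python
# Hypothetical helpers: they reproduce, in Python, the string templates that
# the workflow fills in from `cuda_version_without_periods`.


def artifact_name(python_version: str, cuda_version: str, arch: str = "aarch64") -> str:
    """Build the artifact name downloaded by the install-and-test job."""
    cuda_no_periods = cuda_version.replace(".", "")
    return f"meta-pytorch_torchcodec__{python_version}_cu{cuda_no_periods}_{arch}"


def nightly_index_url(cuda_version: str) -> str:
    """Build the PyTorch nightly index URL for a given CUDA version."""
    cuda_no_periods = cuda_version.replace(".", "")
    return f"https://download.pytorch.org/whl/nightly/cu{cuda_no_periods}"


if __name__ == "__main__":
    print(artifact_name("3.10", "12.6"))
    print(nightly_index_url("12.6"))
```

If more CUDA versions were ever tested on ARM, the same value would presumably move into the matrix, as in linux_cuda_wheel.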

strategy:
fail-fast: false
matrix:
python-version: ['3.10']
ffmpeg-version-for-tests: ['4.4.2', '5.1.2', '6.1.1', '7.0.1', '8.0']
needs: build
steps:
- uses: actions/download-artifact@v4
with:
name: meta-pytorch_torchcodec__${{ matrix.python-version }}_cu${{ env.cuda_version_without_periods }}_aarch64
path: pytorch/torchcodec/dist/
- name: Setup conda env
uses: conda-incubator/setup-miniconda@v3
with:
auto-update-conda: true
# Using miniforge instead of miniconda ensures that the default
# conda channel is conda-forge instead of main/default. This ensures
# ABI consistency between dependencies:
# https://conda-forge.org/docs/user/transitioning_from_defaults/
miniforge-version: latest
activate-environment: test
python-version: ${{ matrix.python-version }}
- name: Update pip
run: python -m pip install --upgrade pip
- name: Install PyTorch
run: |
${CONDA_RUN} python -m pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu${{ env.cuda_version_without_periods }}
- name: Install torchcodec from the wheel
run: |
wheel_path=`find pytorch/torchcodec/dist -type f -name "*.whl"`
echo Installing $wheel_path
python -m pip install $wheel_path -vvv

- name: Check out repo
uses: actions/checkout@v3
- name: Install ffmpeg, post build
run: |
# Ideally we would have checked for that before installing the wheel,
# but we need to checkout the repo to access this file, and we don't
# want to checkout the repo before installing the wheel to avoid any
# side-effect. It's OK.
source packaging/helpers.sh
assert_ffmpeg_not_installed

conda install "ffmpeg=${{ matrix.ffmpeg-version-for-tests }}" -c conda-forge
ffmpeg -version
echo LD_LIBRARY_PATH=$CONDA_PREFIX/lib:/usr/local/cuda/lib64/:${LD_LIBRARY_PATH} >> $GITHUB_ENV

- name: Install test dependencies
run: |
# Ideally we would find a way to get those dependencies from pyproject.toml
python -m pip install numpy pytest pillow

- name: Delete the src/ folder just for fun
run: |
# The only reason we checked-out the repo is to get access to the
# tests. We don't care about the rest. Out of precaution, we delete
# the src/ folder to be extra sure that we're running the code from
# the installed wheel rather than from the source.
# This is just to be extra cautious and very overkill because a)
# there's no way the `torchcodec` package from src/ can be found from
# the PythonPath: the main point of `src/` is precisely to protect
# against that and b) if we ever were to execute code from
# `src/torchcodec`, it would fail loudly because the built .so files
# aren't present there.
rm -r src/
ls
- name: Run Python tests
run: |
pytest --override-ini="addopts=-v" test
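
The "Delete the src/ folder" step guards against running code from the checkout instead of the installed wheel. A complementary check, sketched below, is to ask Python where a module would be imported from. This is not part of the PR; `module_origin` and `is_loaded_from` are hypothetical helper names, and the standard-library `json` package stands in for `torchcodec` so the snippet runs anywhere.

```python
import importlib.util
from pathlib import Path


def module_origin(module_name: str) -> Path:
    """Return the file a module would be imported from, without importing it."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        raise ImportError(f"cannot locate a file for module {module_name!r}")
    return Path(spec.origin).resolve()


def is_loaded_from(module_name: str, directory) -> bool:
    """True if the module's file lives somewhere under `directory`."""
    origin = module_origin(module_name)
    return Path(directory).resolve() in origin.parents


if __name__ == "__main__":
    # `json` is a stand-in for the installed package under test.
    print(module_origin("json"))
    print(is_loaded_from("json", "src"))
```

In a job like this one, asserting that `is_loaded_from("torchcodec", "src")` is false would express the same intent as removing `src/`, assuming the package is installed and importable.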
7 changes: 6 additions & 1 deletion test/test_encoders.py
@@ -1,6 +1,7 @@
import io
import json
import os
import platform
import re
import subprocess
import sys
@@ -328,7 +329,11 @@ def test_against_cli(

assert_close = torch.testing.assert_close
if sample_rate != asset.sample_rate:
- rtol, atol = 0, 1e-3
+ if platform.machine().lower() == "aarch64":
+ rtol, atol = 0, 1e-2
+ else:
+ rtol, atol = 0, 1e-3

if sys.platform == "darwin":
assert_close = partial(assert_tensor_close_on_at_least, percentage=99)
elif format == "wav":
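
A minimal sketch of the tolerance selection in this hunk, written as a standalone function so it can be exercised without torch. The names `resampling_tolerances` and `within_tolerance` are hypothetical; the closeness check follows the documented `torch.testing.assert_close` criterion, `|actual - expected| <= atol + rtol * |expected|`.

```python
import platform


def resampling_tolerances(machine=None):
    """Return (rtol, atol) for the resampled-audio comparison.

    aarch64 gets a looser absolute tolerance, matching the change above.
    """
    if machine is None:
        machine = platform.machine()
    if machine.lower() == "aarch64":
        return 0, 1e-2
    return 0, 1e-3


def within_tolerance(actual: float, expected: float, rtol: float, atol: float) -> bool:
    """Scalar version of the closeness criterion used by assert_close."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


if __name__ == "__main__":
    for machine in ("x86_64", "aarch64"):
        rtol, atol = resampling_tolerances(machine)
        print(machine, within_tolerance(0.505, 0.5, rtol, atol))
```

With `rtol` fixed at 0, only the absolute difference matters, so a sample that is off by 0.005 passes on aarch64 but not on other machines.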
4 changes: 3 additions & 1 deletion test/utils.py
@@ -2,6 +2,7 @@
import json
import os
import pathlib
import platform
import subprocess
import sys

@@ -138,7 +139,7 @@ def psnr(a, b, max_val=255) -> float:
# not guarantee bit-for-bit equality across systems and architectures, so we
# also cannot. We currently use Linux on x86_64 as our reference system.
def assert_frames_equal(*args, **kwargs):
- if sys.platform == "linux":
+ if sys.platform == "linux" and "x86" in platform.machine().lower():
if args[0].device.type == "cuda":
atol = 3 if cuda_version_used_for_building_torch() >= (13, 0) else 2
if get_ffmpeg_major_version() == 4:
@@ -150,6 +151,7 @@ def assert_frames_equal(*args, **kwargs):
else:
torch.testing.assert_close(*args, **kwargs, atol=0, rtol=0)
else:
# Here: Windows, MacOS, and Linux for non-x86 architectures like aarch64
torch.testing.assert_close(*args, **kwargs, atol=3, rtol=0)
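
The new condition can be read as a "reference platform" predicate: strict comparisons apply only on Linux x86, and everything else falls through to `atol=3`. The sketch below isolates that predicate; `is_reference_platform` is a hypothetical name, and the machine strings are typical `platform.machine()` values rather than an exhaustive list.

```python
import platform
import sys


def is_reference_platform(sys_platform=None, machine=None) -> bool:
    """True only for Linux on an x86 machine, the reference system for frame equality."""
    if sys_platform is None:
        sys_platform = sys.platform
    if machine is None:
        machine = platform.machine()
    return sys_platform == "linux" and "x86" in machine.lower()


if __name__ == "__main__":
    cases = [
        ("linux", "x86_64"),   # reference system: strict tolerances
        ("linux", "aarch64"),  # the new ARM job: relaxed tolerance
        ("darwin", "arm64"),   # macOS: relaxed tolerance
        ("win32", "AMD64"),    # Windows: relaxed tolerance
    ]
    for sys_platform, machine in cases:
        print(sys_platform, machine, is_reference_platform(sys_platform, machine))
```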

