Add pixel formats and color handling to VideoEncoder GPU #1125
Conversation
| {-0.148f, -0.291f, 0.439f, 128.0f}, | ||
| // V = 0.439*R - 0.368*G - 0.071*B + 128 (BT.601 coefficients) | ||
| {0.439f, -0.368f, -0.071f, 128.0f}}; | ||
| // RGB to YUV conversion matrices to use in NPP color conversion functions |
Can you share how these were derived? What were the original values that were used as reference?
These follow the pattern described in the note in CUDACommon.cpp; I can add a comment here referencing that note.
torchcodec/src/torchcodec/_core/CUDACommon.cpp
Lines 43 to 44 in ee8ce04
| // Color space and color range | |
| // --------------------------- |
The note is about YUV -> RGB so it's not 100% targeted to what the matrices are doing. But yes, add a ref to that note, it's still useful.
You asked me offline whether we should update the note to explain limited range: yes, we should :)
There's a TODO in the note for that, but I never had the chance to do it - and frankly I forgot the underlying logic lol. If you'd like to give it a go, please do it - in a follow-up PR.
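For reference, the U and V rows quoted at the top of this thread can be reproduced from the standard BT.601 luma coefficients (Kr = 0.299, Kb = 0.114) together with the usual 8-bit limited-range scaling (219/255 for luma, 224/255 for chroma). The sketch below is only one way to arrive at those numbers, not necessarily how the values in the PR were obtained; `rgb_to_yuv_matrix` is a hypothetical helper name, not something in torchcodec.

```python
def rgb_to_yuv_matrix(kr, kb, full_range=False):
    """Build a 3x4 RGB -> YUV matrix for 8-bit RGB input in [0, 255].

    Hypothetical helper for illustration. Each row is [r, g, b, offset].
    """
    kg = 1.0 - kr - kb

    # Unscaled form: Y spans the full input range, U and V span +/- half of it.
    y_row = [kr, kg, kb]
    u_row = [-kr / (2 * (1 - kb)), -kg / (2 * (1 - kb)), 0.5]
    v_row = [0.5, -kg / (2 * (1 - kr)), -kb / (2 * (1 - kr))]

    if full_range:
        # pc (full) range: no compression, luma offset 0.
        y_scale, c_scale, y_offset = 1.0, 1.0, 0.0
    else:
        # tv (limited) range: Y in [16, 235], U and V in [16, 240].
        y_scale, c_scale, y_offset = 219 / 255, 224 / 255, 16.0

    return [
        [c * y_scale for c in y_row] + [y_offset],
        [c * c_scale for c in u_row] + [128.0],
        [c * c_scale for c in v_row] + [128.0],
    ]


# BT.601, limited range: rows 1 and 2 round to the U and V rows in the diff.
bt601_tv = rgb_to_yuv_matrix(0.299, 0.114)
```

Passing the BT.709 (Kr = 0.2126, Kb = 0.0722) or BT.2020 (Kr = 0.2627, Kb = 0.0593) coefficients to the same function gives the corresponding matrices for those color spaces.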
test/test_encoders.py
Outdated
| assert encoder_metadata["color_range"] == ffmpeg_metadata["color_range"] | ||
| assert encoder_metadata["color_space"] == ffmpeg_metadata["color_space"] |
We'll want to be stricter here:
| assert encoder_metadata["color_range"] == ffmpeg_metadata["color_range"] | |
| assert encoder_metadata["color_space"] == ffmpeg_metadata["color_space"] | |
| assert encoder_metadata["color_range"] == ffmpeg_metadata["color_range"] == color_range | |
| assert encoder_metadata["color_space"] == ffmpeg_metadata["color_space"] == color_space |
| assert encoder_metadata["pix_fmt"] == "yuv420p" | ||
| assert ffmpeg_metadata["pix_fmt"] == "yuv420p" |
Looks like we're not asserting pix_fmt anymore, which makes it hard to verify that the changes in this PR are correct. IIRC, passing NV12 actually resulted in a yuv420p format at the end. We should try to understand why that was the case. It may not add much value to support both nv12 and yuv420p as parameter values if they're both the same thing (and if they both end up being yuv420 anyway).
I suspect the format changes depend on the codec implementation. After adding this assertion back, I observed that the deprecated pixel format yuvj420p is set when the pc (full) color range is used with h264_nvenc and hevc_nvenc, but not with av1_nvenc.
I can incorporate pixel formats into my benchmarking PR to see whether using nv12 offers any performance benefit.
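In FFmpeg, yuvj420p is the deprecated full-range spelling of yuv420p, so one option for keeping the pix_fmt assertion across codecs is to normalize the reported value before comparing. A minimal sketch, assuming the metadata carries ffprobe-style `pix_fmt` and `color_range` strings; `normalize_pix_fmt` is a hypothetical helper, not part of the test suite.

```python
# Deprecated full-range ("J") pixel formats and their modern equivalents.
_YUVJ_EQUIVALENTS = {
    "yuvj420p": "yuv420p",
    "yuvj422p": "yuv422p",
    "yuvj444p": "yuv444p",
}


def normalize_pix_fmt(pix_fmt, color_range):
    """Map a deprecated yuvj* format to (modern format, "pc").

    Hypothetical helper: other formats pass through unchanged together
    with the color range that was reported for them.
    """
    if pix_fmt in _YUVJ_EQUIVALENTS:
        # A yuvj* format implies full range by definition.
        return _YUVJ_EQUIVALENTS[pix_fmt], "pc"
    return pix_fmt, color_range
```

With this, a test could assert `normalize_pix_fmt(ffmpeg_metadata["pix_fmt"], ffmpeg_metadata["color_range"]) == ("yuv420p", color_range)` for all three NVENC codecs.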
The new assertions you added are good. But I personally still do not understand why passing NV12 actually ends up being reported as yuv420.
Is it actually still NV12, and it's just FFmpeg that can't tell the difference? Or is it indeed yuv420?
Until we get a good understanding of that, I think we should refrain from allowing pixel_format with CUDA encoding. We have a surprising behavior that we cannot explain right now: passing NV12 leads to yuv420. If we're surprised, our users will be surprised too, and we won't have a good explanation to give them. It is safer to simply not expose this functionality just yet, and let them rely on the default behavior.
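As background for the question above: nv12 and yuv420p are both 8-bit 4:2:0 formats holding the same samples, and they differ only in how chroma is laid out in memory. nv12 stores one interleaved UV plane after the Y plane, while yuv420p stores separate U and V planes. The sketch below illustrates only that layout difference on a tightly packed raw frame; it does not show what NVENC or FFmpeg do internally, and `nv12_to_yuv420p` is a hypothetical helper.

```python
def nv12_to_yuv420p(frame: bytes, width: int, height: int) -> bytes:
    """Repack a tightly packed NV12 frame as planar yuv420p (I420).

    Hypothetical helper for illustration; assumes no row padding.
    """
    if width % 2 or height % 2:
        raise ValueError("4:2:0 frames need even width and height")
    y_size = width * height
    if len(frame) != y_size * 3 // 2:
        raise ValueError("unexpected buffer size for an NV12 frame")

    y_plane = frame[:y_size]
    uv_plane = frame[y_size:]      # U0 V0 U1 V1 ...
    u_plane = uv_plane[0::2]       # every even byte is a U sample
    v_plane = uv_plane[1::2]       # every odd byte is a V sample
    return y_plane + u_plane + v_plane
```

No sample values change in the repacking, which is consistent with the two formats being interchangeable as encoder input, though that alone does not explain which format gets reported.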
This PR resolves these TODOs:
// TODO-VideoEncoder: Enable configuration of color properties, similar to FFmpeg.
// TODO-VideoEncoder: Enable user set pixel formats to be set and handled with the appropriate NPP function
To address the first TODO, it adds support for the following parameters:
Color spaces: BT.601, BT.709, BT.2020
Color ranges: tv (limited), pc (full)
ColorConversionMatrices are stored and utilized for NPP color conversion functions.
To address the second TODO, the correct NPP functions are called to support the pixel formats nv12 and yuv420p; yuv420p is used as the default pixel format for GPU encoding.