@Dan-Flores
Contributor

This PR resolves these TODOs:
// TODO-VideoEncoder: Enable configuration of color properties, similar to FFmpeg.
// TODO-VideoEncoder: Enable user set pixel formats to be set and handled with the appropriate NPP function

To address the first TODO, it adds support for the following parameters:
Color spaces: BT.601, BT.709, BT.2020
Color ranges: tv (limited), pc (full)

  • To my understanding, these are the most commonly used, or the newer, color space parameters (see the docs for AVColorSpace)
  • As a result, 6 ColorConversionMatrices (3 color spaces × 2 color ranges) are stored and used by the NPP color conversion functions.
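For readers who want to see where six such matrices can come from, here is a minimal sketch of the standard RGB to YUV derivation from the luma coefficients Kr and Kb. It is not the code in this PR: the names `LUMA_COEFFS` and `rgb_to_yuv_matrix` are hypothetical, 8-bit samples are assumed, and the `[R, G, B, offset]` row layout is inferred from the matrix rows quoted in the diff.

```python
# Luma coefficients (Kr, Kb) as published in the ITU-R recommendations.
LUMA_COEFFS = {
    "bt601": (0.299, 0.114),
    "bt709": (0.2126, 0.0722),
    "bt2020": (0.2627, 0.0593),
}


def rgb_to_yuv_matrix(color_space, color_range):
    """Return a 3x4 RGB -> YUV matrix (rows Y, U, V; columns R, G, B, offset).

    Hypothetical helper for illustration only; assumes 8-bit samples.
    """
    kr, kb = LUMA_COEFFS[color_space]
    kg = 1.0 - kr - kb
    if color_range == "tv":  # limited range: Y in [16, 235], chroma in [16, 240]
        y_scale, c_scale, y_offset = 219.0 / 255.0, 224.0 / 255.0, 16.0
    elif color_range == "pc":  # full range: no scaling, no luma offset
        y_scale, c_scale, y_offset = 1.0, 1.0, 0.0
    else:
        raise ValueError(f"unknown color range: {color_range}")

    y_row = [kr * y_scale, kg * y_scale, kb * y_scale, y_offset]
    # U = (B - Y) / (2 * (1 - Kb)), V = (R - Y) / (2 * (1 - Kr)), then scaled.
    u_row = [
        -kr / (2 * (1 - kb)) * c_scale,
        -kg / (2 * (1 - kb)) * c_scale,
        0.5 * c_scale,
        128.0,
    ]
    v_row = [
        0.5 * c_scale,
        -kg / (2 * (1 - kr)) * c_scale,
        -kb / (2 * (1 - kr)) * c_scale,
        128.0,
    ]
    return [y_row, u_row, v_row]


# 3 color spaces x 2 color ranges = 6 matrices.
ALL_MATRICES = {
    (space, rng): rgb_to_yuv_matrix(space, rng)
    for space in LUMA_COEFFS
    for rng in ("tv", "pc")
}
```

Under these assumptions, the BT.601 limited-range U and V rows round to the coefficients shown in the diff excerpt (for example 0.439, -0.368, -0.071 for V).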

To address the second TODO, the correct NPP functions are called to support the pixel formats nv12, yuv420p, and yuv420p is used as the default pixel format for GPU encoding

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Dec 11, 2025
{-0.148f, -0.291f, 0.439f, 128.0f},
// V = 0.439*R - 0.368*G - 0.071*B + 128 (BT.601 coefficients)
{0.439f, -0.368f, -0.071f, 128.0f}};
// RGB to YUV conversion matrices to use in NPP color conversion functions
Contributor

Can you share how these were derived? What were the original values that were used as reference?

Contributor Author

These follow the pattern described in the note in CudaCommon. I can add a comment here referencing that note.

// Color space and color range
// ---------------------------

Contributor

@NicolasHug NicolasHug Dec 12, 2025

The note is about YUV -> RGB so it's not 100% targeted to what the matrices are doing. But yes, add a ref to that note, it's still useful.

You asked me offline whether we should update the note to explain limited range: yes, we should :)
There's a TODO in the note for that, but I never had the chance to do it - and frankly I forgot the underlying logic lol. If you'd like to give it a go, please do it - in a follow-up PR.

Comment on lines 1463 to 1464
assert encoder_metadata["color_range"] == ffmpeg_metadata["color_range"]
assert encoder_metadata["color_space"] == ffmpeg_metadata["color_space"]
Contributor

We'll want to be stricter here:

Suggested change
- assert encoder_metadata["color_range"] == ffmpeg_metadata["color_range"]
- assert encoder_metadata["color_space"] == ffmpeg_metadata["color_space"]
+ assert encoder_metadata["color_range"] == ffmpeg_metadata["color_range"] == color_range
+ assert encoder_metadata["color_space"] == ffmpeg_metadata["color_space"] == color_space
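
To illustrate why the chained comparison is stricter, here is a small sketch with made-up metadata. The helper names and the dictionaries are hypothetical and are not part of the test suite; the point is only that two outputs can agree with each other while both ignoring the requested value.

```python
def outputs_agree(encoder_metadata, ffmpeg_metadata, key):
    # Weaker check: only compares the two outputs to each other.
    return encoder_metadata[key] == ffmpeg_metadata[key]


def outputs_match_request(encoder_metadata, ffmpeg_metadata, key, requested):
    # Stricter check: a == b == c is evaluated as (a == b) and (b == c).
    return encoder_metadata[key] == ffmpeg_metadata[key] == requested


# Hypothetical case: "pc" was requested, but both outputs report "tv".
encoder_metadata = {"color_range": "tv"}
ffmpeg_metadata = {"color_range": "tv"}

weak_result = outputs_agree(encoder_metadata, ffmpeg_metadata, "color_range")
strict_result = outputs_match_request(
    encoder_metadata, ffmpeg_metadata, "color_range", "pc"
)
```

In this made-up case the weaker check passes and the stricter one fails, which is the situation the suggested change is meant to catch.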

Comment on lines -1437 to -1438
assert encoder_metadata["pix_fmt"] == "yuv420p"
assert ffmpeg_metadata["pix_fmt"] == "yuv420p"
Contributor

Looks like we're no longer asserting pix_fmt, which makes it hard to verify that the changes in this PR are correct. IIRC, passing NV12 actually resulted in a yuv420p format at the end. We should try to understand why that was the case. It may not add a lot of value to support both nv12 and yuv420p as parameter values if they're both the same thing (and if they both end up being yuv420p anyway).

Contributor Author

I suspect the format changes depend on the codec implementation. After adding this assertion back, I observed that the deprecated pixel format yuvj420p is set when the pc (full) color range is used with h264_nvenc and hevc_nvenc, but not with av1_nvenc.
I can incorporate pixel formats into my benchmarking PR to see whether using nv12 offers any performance benefit.
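
For context, FFmpeg documents the yuvj* pixel formats as deprecated in favor of the matching yuv* format combined with a full color range. A test that wants to compare pixel formats across codecs could normalize them first. The sketch below is one hypothetical way to do that; `normalize_pix_fmt` and the alias table are illustrative names, not existing helpers in this repository.

```python
# Deprecated full-range ("JPEG") aliases and their non-deprecated equivalents.
DEPRECATED_FULL_RANGE_ALIASES = {
    "yuvj420p": "yuv420p",
    "yuvj422p": "yuv422p",
    "yuvj444p": "yuv444p",
}


def normalize_pix_fmt(pix_fmt, color_range=None):
    """Map a deprecated yuvj* format to (yuv* format, "pc").

    Other formats are returned unchanged together with the given color range.
    Hypothetical helper for illustration only.
    """
    if pix_fmt in DEPRECATED_FULL_RANGE_ALIASES:
        return DEPRECATED_FULL_RANGE_ALIASES[pix_fmt], "pc"
    return pix_fmt, color_range
```

With such a helper, an assertion could compare the normalized pair instead of the raw pix_fmt string, so that yuvj420p and yuv420p with a pc range are treated as the same result.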

Contributor

The new assertions you added are good. But I personally still do not understand why passing NV12 ends up being reported as yuv420p.
Is it actually still NV12, and it's just FFmpeg that can't tell the difference? Or is it indeed yuv420p?

Until we have a good understanding of that, I think we should refrain from allowing pixel_format with CUDA encoding. We have a surprising behavior that we cannot explain right now: passing NV12 leads to yuv420p. If we're surprised, our users will be surprised too, and we won't have a good explanation to give them. It is safer to simply not expose this functionality just yet, and let them rely on the default behavior.
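
As background for the question above: nv12 and yuv420p both carry 4:2:0 subsampled samples at 12 bits per pixel and differ only in how chroma is laid out in memory (one interleaved UV plane versus separate U and V planes). The sketch below converts between the two layouts for a raw 8-bit frame. The function names are hypothetical, even width and height are assumed, and this does not settle why the encoded output is reported as yuv420p; it only shows that the two layouts hold the same samples.

```python
def yuv420p_to_nv12(frame: bytes, width: int, height: int) -> bytes:
    """Repack planar Y, U, V into Y followed by interleaved UV (nv12)."""
    y_size = width * height
    c_size = (width // 2) * (height // 2)
    y_plane = frame[:y_size]
    u_plane = frame[y_size:y_size + c_size]
    v_plane = frame[y_size + c_size:y_size + 2 * c_size]
    uv_plane = bytearray(2 * c_size)
    uv_plane[0::2] = u_plane  # U samples at even positions
    uv_plane[1::2] = v_plane  # V samples at odd positions
    return bytes(y_plane) + bytes(uv_plane)


def nv12_to_yuv420p(frame: bytes, width: int, height: int) -> bytes:
    """Repack Y plus interleaved UV (nv12) into planar Y, U, V."""
    y_size = width * height
    y_plane = frame[:y_size]
    uv_plane = frame[y_size:]
    return bytes(y_plane) + bytes(uv_plane[0::2]) + bytes(uv_plane[1::2])


# Tiny 4x2 example frame: 8 luma bytes, then 2 U bytes and 2 V bytes.
planar = bytes(range(8)) + b"\x10\x11" + b"\x20\x21"
semi_planar = yuv420p_to_nv12(planar, 4, 2)
round_trip = nv12_to_yuv420p(semi_planar, 4, 2)
```

Because the conversion is a lossless reordering of bytes, a round trip returns the original frame unchanged.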
