scotts
Contributor

@scotts scotts commented Sep 10, 2025

Refactors pybind_ops.cpp so that it does not know about SingleStreamDecoder at all; it only knows about AVIOFileLikeContext. It now has a single function:

int64_t create_file_like_context(py::object file_like, bool is_for_writing);

Given a Python file-like object, it creates an AVIOFileLikeContext from it. We return a pointer to that object as a Python int. This means that all decoder creation is now in custom_ops.cpp. The creation functions there accept the int and cast it back into an AVIOFileLikeContext pointer.
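The pointer-as-int round trip described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakeContext`, `createFakeContext`, and `adoptFakeContext` are hypothetical names, and the sketch assumes the consuming side takes ownership of the object.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical stand-in for AVIOFileLikeContext; not the real class.
struct FakeContext {
  std::string name;
  bool isForWriting;
};

// Producer side (the role pybind_ops.cpp plays): allocate the object and
// return its address as an integer that Python can carry around.
int64_t createFakeContext(const std::string& name, bool isForWriting) {
  auto* ctx = new FakeContext{name, isForWriting};
  return reinterpret_cast<int64_t>(ctx);
}

// Consumer side (the role custom_ops.cpp plays): cast the integer back into
// a pointer and take ownership so the object is freed exactly once.
// Assumption: each handle is adopted at most once.
std::unique_ptr<FakeContext> adoptFakeContext(int64_t handle) {
  assert(handle != 0 && "handle must come from createFakeContext");
  return std::unique_ptr<FakeContext>(reinterpret_cast<FakeContext*>(handle));
}
```

The integer is only meaningful inside the process that created it, and nothing checks that it really points at the expected type, so the cast relies on both sides agreeing on the class definition.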

Upsides:

  1. pybind_ops.cpp is now much simpler, and it's clear that it is only concerned with creating the context for file-like objects.
  2. All decoder creation is in one C++ file.
  3. The Python side no longer has to do a conversion from an int into a tensor. (We did this because decoder creation in pybind_ops.cpp returned an int.)
  4. All of the restrictions we had on encode_audio_to_file_like() are now gone. We can pass the samples as a proper tensor.

Downsides:

  1. Some decoder creation in custom_ops.cpp now has to accept an int from the Python side and do a cast.
  2. We need to explicitly make the visibility of AVIOFileLikeContext hidden so that the class has the same visibility in both pybind_ops.cpp and custom_ops.cpp. I only roughly understand this part.

I think that the upsides of this approach outweigh its downsides, compared to how we were doing it before.

@meta-cla meta-cla bot added the CLA Signed label Sep 10, 2025
@scotts scotts marked this pull request as ready for review September 10, 2025 16:59
Contributor

@NicolasHug NicolasHug left a comment

Thanks @scotts, I left two areas to potentially investigate, but those can be left as follow-ups. I agree this is a net improvement.

Comment on lines 17 to 19
// pybind_ops.cpp with the global visibility flag as required. On the
// custom_ops.cpp side we don't need to do that. However, we do need to ensure
// that this class is also seen as the same visibility.
Contributor

This is fine, but if we can, we might as well set the hidden visibility flag for the whole custom_ops lib and remove the VISIBILITY_HIDDEN stuff:

        target_compile_options(
            ${custom_ops_library_name}
            PUBLIC
            "-fvisibility=hidden"
        )

This seems to work OK locally.

Contributor

Here's what I did; I can confirm this compiles locally:

diff --git a/src/torchcodec/_core/AVIOFileLikeContext.h b/src/torchcodec/_core/AVIOFileLikeContext.h
index 3146c00..fd7f534 100644
--- a/src/torchcodec/_core/AVIOFileLikeContext.h
+++ b/src/torchcodec/_core/AVIOFileLikeContext.h
@@ -11,27 +11,13 @@
 
 #include "src/torchcodec/_core/AVIOContextHolder.h"
 
-// pybind11 has specific visibility rules; see:
-//  https://pybind11.readthedocs.io/en/stable/faq.html#someclass-declared-with-greater-visibility-than-the-type-of-its-field-someclass-member-wattributes
-// This file is included in both pybind_ops.cpp and custom_ops.cpp. We compile
-// pybind_ops.cpp with the global visibility flag as required. On the
-// custom_ops.cpp side we don't need to do that. However, we do need to ensure
-// that this class is also seen as the same visibility.
-//
-// This only matters on Linux and Mac; on Windows we don't need to do anything.
-#ifdef _WIN32
-#define VISIBILITY_HIDDEN
-#else
-#define VISIBILITY_HIDDEN __attribute__((visibility("hidden")))
-#endif
-
 namespace py = pybind11;
 
 namespace facebook::torchcodec {
 
 // Enables uers to pass in a Python file-like object. We then forward all read
 // and seek calls back up to the methods on the Python object.
-class VISIBILITY_HIDDEN AVIOFileLikeContext : public AVIOContextHolder {
+class AVIOFileLikeContext : public AVIOContextHolder {
  public:
   explicit AVIOFileLikeContext(const py::object& fileLike, bool isForWriting);
 
diff --git a/src/torchcodec/_core/CMakeLists.txt b/src/torchcodec/_core/CMakeLists.txt
index 03f68f6..c45a614 100644
--- a/src/torchcodec/_core/CMakeLists.txt
+++ b/src/torchcodec/_core/CMakeLists.txt
@@ -137,6 +137,14 @@ function(make_torchcodec_libraries
         "${custom_ops_dependencies}"
     )
 
+    if(NOT WIN32)
+        target_compile_options(
+            ${custom_ops_library_name}
+            PUBLIC
+            "-fvisibility=hidden"
+        )
+    endif()
+
     # 3. Create libtorchcodec_pybind_opsN.so.
     set(pybind_ops_library_name "libtorchcodec_pybind_ops${ffmpeg_major_version}")
     set(pybind_ops_sources

Contributor Author

@NicolasHug, yeah, I did the same. The one difference is that I put both of them under the same if(NOT WIN32), with a comment explaining it. It breaks the convention of only doing one library at a time, but I think it helps understanding: all of the visibility settings, which share the same cause, are in one place.
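For readers unfamiliar with the visibility issue in this thread, here is a minimal sketch of the per-class approach that the diff above removes. It assumes GCC or Clang on Linux or Mac; `SKETCH_HIDDEN`, `HiddenMember`, and `Holder` are hypothetical names and do not appear in the codebase.

```cpp
#include <cassert>

// Hypothetical macro, mirroring the VISIBILITY_HIDDEN pattern in the diff.
#ifdef _WIN32
#define SKETCH_HIDDEN
#else
#define SKETCH_HIDDEN __attribute__((visibility("hidden")))
#endif

// Stand-in for a type that is declared with hidden visibility, as pybind11
// types are.
struct SKETCH_HIDDEN HiddenMember {
  int value = 7;
};

// Holder stores a HiddenMember by value. If Holder kept default visibility,
// GCC's -Wattributes check would report that it is declared with greater
// visibility than the type of its field. Marking Holder hidden as well keeps
// the two consistent.
class SKETCH_HIDDEN Holder {
 public:
  int get() const { return member_.value; }

 private:
  HiddenMember member_;
};
```

Compiling the whole library with `-fvisibility=hidden`, as done in this PR, should give the same consistency without annotating each class.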

std::move(avioContextHolder),
audioStreamOptions);
encoder.encode();
int64_t create_file_like_context(py::object file_like, bool is_for_writing) {
Contributor

The comment above was meant to be about "laundering ints into tensors" because it was about the decoder object itself. Now we're laundering an int into an AVIOFileLikeContext object IIUC, not a tensor, so the comment may need to be updated.

Separately from the comment, and since we shouldn't need the intermediate tensor representation, I wonder if we could actually work around all that by "cleanly" binding the AVIOFileLikeContext to Python and just passing normal py::objects? I thought the whole laundering problem was due to a problem between pybind and torch tensors, but IIUC we don't need the tensor part at all anymore.

Contributor Author

I believe py::objects need to be existing kinds of objects Python already knows about. I thought about what you're suggesting, and I think that would mean actually defining a new kind of Python to C++ object through pybind. We can do it, but it will take some work to figure out if it's worth it.

Contributor Author

@scotts scotts Sep 11, 2025

I improved the comment, and created and referenced #896.

@scotts scotts merged commit ab026c8 into meta-pytorch:main Sep 12, 2025
47 checks passed
@scotts scotts deleted the refactor_file_like branch September 12, 2025 20:00