Skip to content

Conversation

@yuya-haruna
Copy link

@yuya-haruna yuya-haruna commented Sep 9, 2025

What this does

Delete the empty images dir after converting the video on save_episode().
Use VideoEncodingManager for delete images dir.

How it was tested

Added test_add_frame_video in tests/datasets/test_datasets.py.

  • Test mp4 file created.
  • Test images dir deleted.

How to checkout & try? (for the reviewer)

$ pytest -sv tests/datasets/test_datasets.py -k test_add_frame_video
================================================================== test session starts ==================================================================
platform darwin -- Python 3.11.8, pytest-8.4.2, pluggy-1.6.0
rootdir: /Users/yuya.haruna/project/yuya.haruna/lerobot
configfile: pyproject.toml
plugins: timeout-2.4.0, cov-6.3.0, mock-serial-0.0.1
collected 52 items / 51 deselected / 1 selected                                                                                                         

Testing with DEVICE='cpu'

tests/datasets/test_datasets.py .                                                                                                                 [100%]

=========================================================== 1 passed, 51 deselected in 3.27s ============================================================

Copilot AI review requested due to automatic review settings September 9, 2025 06:39
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds automatic cleanup of empty image directories after video encoding in the VideoEncodingManager. The change ensures that temporary image directories created during video encoding are removed once the video conversion is complete.

Key changes:

  • Wraps video encoding operations with VideoEncodingManager context manager to handle cleanup
  • Adds test coverage for video dataset creation and verification of cleanup behavior

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/lerobot/datasets/lerobot_dataset.py Wraps video encoding calls with VideoEncodingManager context manager for automatic cleanup
tests/datasets/test_datasets.py Adds video dataset fixture and test to verify video creation and image directory cleanup

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 867 to 868
with VideoEncodingManager(self):
self.encode_episode_videos(episode_index)
Copy link

Copilot AI Sep 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The VideoEncodingManager context manager wraps only the encode_episode_videos call. Consider whether other operations within this conditional block should also be included in the context manager scope for consistency.

Copilot uses AI. Check for mistakes.
Comment on lines 877 to 882
logging.info(
f"Batch encoding {self.batch_encoding_size} videos for episodes {start_ep} to {end_ep - 1}"
)
self.batch_encode_videos(start_ep, end_ep)
with VideoEncodingManager(self):
Copy link

Copilot AI Sep 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Similar to the single episode encoding, consider whether the logging statement on line 879-881 should be included within the VideoEncodingManager context for consistency with the overall batch operation.

Suggested change
logging.info(
f"Batch encoding {self.batch_encoding_size} videos for episodes {start_ep} to {end_ep - 1}"
)
self.batch_encode_videos(start_ep, end_ep)
with VideoEncodingManager(self):
with VideoEncodingManager(self):
logging.info(
f"Batch encoding {self.batch_encoding_size} videos for episodes {start_ep} to {end_ep - 1}"
)

Copilot uses AI. Check for mistakes.
@imstevenpmwork imstevenpmwork added enhancement Suggestions for new features or improvements dataset Issues regarding data inputs, processing, or datasets labels Oct 17, 2025
@imstevenpmwork imstevenpmwork self-assigned this Oct 17, 2025
@CarolinePascal
Copy link
Collaborator

Hi @yuya-haruna,

Thank you for your contribution ! If I understand correctly, you suggest to delete the temporary images folder at the end of each episode or batch of episodes, right ? If that's indeed the case, that would mean that the folder has to be re-created at the beginning of the next episode/batch of episodes, which looks a bit troublesome.

Let me know if there is something I missed !

Best,

Caroline,

@yuya-haruna
Copy link
Author

Hi, @CarolinePascal
Thanks for the feedback!

Yes, VideoEncodingManager.__exit__ now wipes the empty images directory.
However, the next episode doesn’t require any manual setup — LeRobotDataset.add_frame() calls img_path.parent.mkdir(parents=True, exist_ok=True) (code) when writing the first frame, so the directory is automatically recreated.

I understand this doesn’t really make things any less troublesome.
If I’ve misunderstood your question, please let me know.

@CarolinePascal
Copy link
Collaborator

Hi @yuya-haruna,

The image folder is indeed "created" at the beginning of each episode, but that does not mean that we have to delete it at the end of each episode. It should rather be seen as a safety measure in case this folder is mistakenly deleted between episodes.
Also, in practice,mkdir will not delete the directory and recreate it if it already exists, so it will be (a bit) faster.

Eventhough deleting the image folder as you suggest would not impact the recording process, it will 1) make the code more bloated and 2) somehow increase the recording overhead as the folder is deleted and recreated at each new episode.

Unless there is something else I didn't understand (please let me know !), I suggest we close this PR.

Best,

Caroline.

@yuya-haruna
Copy link
Author

Hi, @CarolinePascal

It should rather be seen as a safety measure in case this folder is mistakenly deleted between episodes.

The images directory deleted by VideoEncodingManager is only recognized when it is empty.
Therefore, I believe there is no need to worry about accidentally deleting a directory that still contains images.

I created this PR because it seemed unnatural that the images directory remained empty after conversion, even though all *.jpg files within it had been deleted.

While minor, the increased cost of directory recreation is indeed as you pointed out.
Please close this if unnecessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dataset Issues regarding data inputs, processing, or datasets enhancement Suggestions for new features or improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants