This project demonstrates how to estimate depth maps from 2D RGB images and transform them into 3D point clouds using modern deep learning models. It leverages pretrained models for monocular depth estimation (e.g., MiDaS, DPT, Depth-Anything) to infer spatial geometry from single images.
The pipeline takes a 2D image as input and performs the following steps:
- Depth Estimation — A pretrained deep learning model predicts a dense depth map (see the sketch just after this list).
- Depth Visualization — The predicted depth is normalized and colorized for analysis.
- 3D Point Cloud Generation — The depth map is projected into 3D space to reconstruct the scene geometry.
- (Optional) Mesh reconstruction and visualization with Open3D.
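For the depth estimation step, the sketch below uses the MiDaS small model loaded from torch.hub. This is one workable choice among the backends named above (DPT and Depth-Anything have their own loaders), and input.jpg is a placeholder path.

```python
import cv2
import torch

# Load a pretrained MiDaS model plus its matching preprocessing transform.
# "MiDaS_small" is the lightweight variant; DPT variants are published
# under the same hub repo.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.small_transform

# Read the input image as RGB ("input.jpg" is a placeholder).
img = cv2.cvtColor(cv2.imread("input.jpg"), cv2.COLOR_BGR2RGB)

with torch.no_grad():
    prediction = midas(transform(img))
    # Upsample the prediction back to the input resolution.
    prediction = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze()

# Note: MiDaS predicts relative inverse depth, not metric depth.
depth = prediction.cpu().numpy()
```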
As illustrated below, the deep learning model predicts a dense depth map.
Left: Original RGB image
Right: Estimated depth map
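A colorized view like the one on the right can be produced by min-max normalizing the raw prediction and applying a colormap; a minimal sketch (the choice of COLORMAP_INFERNO is arbitrary):

```python
import cv2
import numpy as np

def colorize_depth(depth: np.ndarray) -> np.ndarray:
    # Min-max normalize to [0, 1], scale to 8-bit, then apply a colormap.
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)
    return cv2.applyColorMap((d * 255.0).astype(np.uint8), cv2.COLORMAP_INFERNO)

# e.g. cv2.imwrite("depth_vis.png", colorize_depth(depth))
```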
The depth map is then transformed into a 3D point cloud:
Left: Original point cloud
Right: Colored point cloud
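One way to obtain these clouds is to back-project every pixel through a pinhole camera model. The sketch below assumes rough intrinsics (fx, fy, cx, cy are guesses, not calibrated values) and treats the prediction as if it were metric depth, which is an approximation for relative-depth models like MiDaS.

```python
import numpy as np
import open3d as o3d

def depth_to_point_cloud(depth, rgb, fx, fy, cx, cy):
    # Pinhole back-projection: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float32)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)

    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    # Attach per-pixel RGB (scaled to [0, 1]) to get the colored cloud.
    pcd.colors = o3d.utility.Vector3dVector(rgb.reshape(-1, 3) / 255.0)
    return pcd

# Usage with assumed intrinsics (img and depth from the sketches above):
# h, w = depth.shape
# pcd = depth_to_point_cloud(depth, img, fx=w, fy=w, cx=w / 2, cy=h / 2)
# o3d.visualization.draw_geometries([pcd])
```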
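For the optional mesh step, Open3D's Poisson surface reconstruction is one reasonable route; it requires oriented normals first. A minimal sketch continuing from the pcd built above (the radius, max_nn, and depth parameters are ad hoc and scene-dependent):

```python
import open3d as o3d

# pcd: the colored point cloud built in the previous sketch.
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30)
)
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9
)
o3d.visualization.draw_geometries([mesh])
```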
Repository layout:
- utils.py: holds subfunctions.
- computer_vision.py: holds functions for transforming from 2D to 3D.
- main.py: holds the pipeline.