Javier Tirado-Garín    Javier Civera
    I3A, University of Zaragoza
 
    Camera calibration from a single perspective/edited/distorted image using a freely chosen camera model
The only requirements are Python (≥3.10) and PyTorch. The project can be installed in development mode with:
git clone https://github.com/javrtg/AnyCalib.git && cd AnyCalib
pip install -e .
Optionally, a compatible version of xformers can be installed for better efficiency by running the following instead of pip install -e .:
pip install -e .[eff]
import numpy as np
import torch
from PIL import Image  # the library of choice to load images
from anycalib import AnyCalib
dev = torch.device("cuda")
# load input image and convert it to a (3, H, W) tensor with RGB values in [0, 1]
image = np.array(Image.open("path/to/image.jpg").convert("RGB"))
image = torch.tensor(image, dtype=torch.float32, device=dev).permute(2, 0, 1) / 255
# instantiate AnyCalib according to the desired model_id. Options:
# "anycalib_pinhole": model trained with *only* perspective (pinhole) images,
# "anycalib_gen": trained with perspective, distorted and strongly distorted images,
# "anycalib_dist": trained with distorted and strongly distorted images,
# "anycalib_edit": Trained on edited (stretched and cropped) perspective images.
model = AnyCalib(model_id="anycalib_pinhole").to(dev)
# Alternatively, the weights can be loaded from the huggingface hub as follows:
# NOTE: huggingface_hub (https://pypi.org/project/huggingface-hub/) needs to be installed
# model = AnyCalib().from_pretrained(model_id=<model_id>).to(dev)
# predict according to the desired camera model. Implemented camera models are detailed further below.
output = model.predict(image, cam_id="pinhole")
# output is a dictionary with the following key-value pairs:
# {
#      "intrinsics": (D,) tensor with the estimated intrinsics for the selected camera model,
#      "fov_field": (N, 2) tensor with the regressed FoV field by the network. N≈320^2 (resolution close to the one seen during training),
#      "tangent_coords": alias for "fov_field",
#      "rays": (N, 3) tensor with the corresponding (via the exponential map) ray directions in the camera frame (x right, y down, z forward),
#      "pred_size": (H, W) tuple with the image size used by the network. It can be used e.g. for resizing the FoV/ray fields to the original image size.
# }
The weights of the selected model_id, if not already downloaded, will be automatically downloaded to:
- the torch hub cache directory (torch.hub.get_dir()) if AnyCalib(model_id=<model_id>) is used, or
- the huggingface cache directory if AnyCalib().from_pretrained(model_id=<model_id>) is used.
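As a rough illustration of how the entries of the output dictionary relate to each other, the sketch below maps tangent-plane coordinates to unit rays with the exponential map of the sphere, and then fits pinhole intrinsics to pixel/ray pairs with a plain least-squares line fit. It is not AnyCalib's implementation: the base point of the tangent plane (the optical axis, (0, 0, 1)), the intrinsics order (fx, fy, cx, cy) and the helper names exp_map, log_map, fit_line and fit_pinhole are assumptions made for this example only.

```python
import math


def exp_map(u, v):
    """Tangent coordinates (u, v) at the optical axis (0, 0, 1) -> unit ray."""
    theta = math.hypot(u, v)  # angle between the ray and the optical axis
    if theta < 1e-12:
        return (0.0, 0.0, 1.0)
    s = math.sin(theta) / theta
    return (s * u, s * v, math.cos(theta))


def log_map(ray):
    """Inverse of exp_map for unit rays that do not point opposite to the axis."""
    x, y, z = ray
    rho = math.hypot(x, y)  # equals sin(theta) for a unit ray
    if rho < 1e-12:
        return (0.0, 0.0)
    theta = math.atan2(rho, z)
    return (theta * x / rho, theta * y / rho)


def fit_line(xs, ys):
    """Least-squares slope and intercept of ys ~ slope * xs + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_pinhole(pixels, rays):
    """Linear pinhole fit using px = fx * x/z + cx and py = fy * y/z + cy."""
    fx, cx = fit_line([r[0] / r[2] for r in rays], [p[0] for p in pixels])
    fy, cy = fit_line([r[1] / r[2] for r in rays], [p[1] for p in pixels])
    return fx, fy, cx, cy


# Synthetic check: build rays from known intrinsics (x right, y down, z forward),
# pass them through the tangent plane and back, then recover the intrinsics.
true_intrinsics = (500.0, 480.0, 320.0, 240.0)  # assumed order: fx, fy, cx, cy
fx, fy, cx, cy = true_intrinsics
pixels = [(float(px), float(py)) for px in range(0, 640, 80) for py in range(0, 480, 80)]
rays = []
for px, py in pixels:
    x, y = (px - cx) / fx, (py - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    tangent = log_map((x / norm, y / norm, 1.0 / norm))
    rays.append(exp_map(*tangent))

estimated = fit_pinhole(pixels, rays)
print(estimated)
```

With noisy network predictions a plain line fit like this would be fragile, which is presumably why AnyCalib exposes the RANSAC and nonlinear optimization options described in its docstring.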
Additional configuration options are indicated in the docstring of AnyCalib:
 help(AnyCalib) 
    """AnyCalib class.
    Args for instantiation:
        model_id: one of {'anycalib_pinhole', 'anycalib_gen', 'anycalib_dist', 'anycalib_edit'}.
            Each model differs in the type of images seen during training:
                * 'anycalib_pinhole': Perspective (pinhole) images,
                * 'anycalib_gen': General images, including perspective, distorted and
                    strongly distorted images,
                * 'anycalib_dist': Distorted images using the Brown-Conrady camera model
                    and strongly distorted images, using the EUCM camera model, and
                * 'anycalib_edit': Trained on edited (stretched and cropped) perspective
                    images.
            Default: 'anycalib_pinhole'.
        nonlin_opt_method: nonlinear optimization method: 'gauss_newton' or 'lev_mar'.
            Default: 'gauss_newton'
        nonlin_opt_conf: nonlinear optimization configuration.
            This config can be used to control the number of iterations and the space
            where the residuals are minimized. See the classes `GaussNewtonCalib` or
            `LevMarCalib` under anycalib/optim for details. Default: None.
        init_with_sac: use RANSAC instead of nonminimal fit for initializing the
            intrinsics. Default: False.
        fallback_to_sac: use RANSAC if nonminimal fit fails. Default: True.
        ransac_conf: RANSAC configuration. This config can be used to control e.g. the
            inlier threshold or the number of minimal samples to try. See the class
            `RANSAC` in anycalib/ransac.py for details. Default: None.
        rm_borders: border size of the dense FoV fields to ignore during fitting.
            Default: 0.
        sample_size: approximate number of 2D-3D correspondences to use for fitting the
            intrinsics. Negative value -> no subsampling. Default: -1.
    """
AnyCalib can also be run on batches, possibly with a different camera model for each image. For example:
images = ... # (B, 3, H, W)
# NOTE: if cam_ids is a list, then len(cam_ids) must be equal to B
cam_ids = ["pinhole", "radial:1", "kb:4"]  # different camera models for each image
# cam_ids = "pinhole"  # alternatively: same camera model across images
output = model.predict(images, cam_id=cam_ids)
# corresponding batched output dictionary:
# {
#      "intrinsics": List[(D_i,) tensors] for each camera model "i",
#      "fov_field": (B, N, 2) tensor,
#      "tangent_coords": alias for "fov_field",
#      "rays": (B, N, 3) tensor,
#      "pred_size": (H, W).
# }
- cam_id represents the camera model identifier(s) that can be used in the predict method.
- D corresponds to the number of intrinsics of the camera model. It determines the length of each intrinsics tensor in the output dictionary.
| cam_id | Description | D | Intrinsics |
|---|---|---|---|
| pinhole | Pinhole camera model | 4 | |
| simple_pinhole | pinhole with one focal length | 3 | |
| radial:k | Radial (Brown-Conrady) [1] camera model with k distortion coefficients | 4 + k | |
| simple_radial:k | radial:k with one focal length | 3 + k | |
| kb:k | Kannala-Brandt [2] camera model with k distortion coefficients | 4 + k | |
| simple_kb:k | kb:k with one focal length | 3 + k | |
| ucm | Unified Camera Model [3] | 5 | |
| simple_ucm | ucm with one focal length | 4 | |
| eucm | Enhanced Unified Camera Model [4] | 6 | |
| simple_eucm | eucm with one focal length | 5 | |
| division:k | Division camera model [5] with k distortion coefficients | 4 + k | |
| simple_division:k | division:k with one focal length | 3 + k | |
In addition to the original works, we recommend the works of Usenko et al. [6] and Lochman et al. [7] for a comprehensive comparison of the different camera models.
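The D column of the table follows a simple pattern: a base count per model, plus k for the models that take a `:k` suffix, minus one for the `simple_` variants. The sketch below encodes that pattern and the rule that a list of cam_ids must have one entry per image. The helper names num_intrinsics and expand_cam_ids are hypothetical and not part of AnyCalib, and the valid range of k is not checked here.

```python
# Base number of intrinsics per camera model, taken from the table above.
BASE_INTRINSICS = {"pinhole": 4, "radial": 4, "kb": 4, "ucm": 5, "eucm": 6, "division": 4}
# Models whose identifier carries a ":k" suffix.
TAKES_K = {"radial", "kb", "division"}


def num_intrinsics(cam_id: str) -> int:
    """Return D, the number of intrinsics, for a cam_id such as 'simple_kb:3'."""
    name, _, k_str = cam_id.partition(":")
    simple = name.startswith("simple_")
    base = name.removeprefix("simple_")
    if base not in BASE_INTRINSICS:
        raise ValueError(f"unknown camera model: {cam_id!r}")
    if (base in TAKES_K) != bool(k_str):
        raise ValueError(f"unexpected or missing ':k' suffix in {cam_id!r}")
    k = int(k_str) if k_str else 0
    # "simple_" variants share one focal length, hence one intrinsic fewer
    return BASE_INTRINSICS[base] + k - (1 if simple else 0)


def expand_cam_ids(cam_ids, batch_size: int) -> list:
    """Broadcast a single cam_id to the batch, or check the length of a list."""
    if isinstance(cam_ids, str):
        return [cam_ids] * batch_size
    if len(cam_ids) != batch_size:
        raise ValueError("len(cam_ids) must be equal to the batch size")
    return list(cam_ids)


for cam_id in expand_cam_ids(["pinhole", "radial:1", "kb:4"], batch_size=3):
    print(cam_id, num_intrinsics(cam_id))
```

A check like this can be handy before calling predict on a batch, since the length of each intrinsics tensor depends on the cam_id chosen for that image.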
The evaluation and training code is built upon the siclib library from GeoCalib, which can be installed as:
pip install -e siclib
Running the evaluation commands will write the results to outputs/results/.
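The evaluation commands in this section share one pattern: a benchmark module under siclib.eval, a tag, and optional key=value overrides. A small helper that assembles them could look like the sketch below. The helper eval_command and the RUNS list are hypothetical and not part of siclib; the sketch only builds and prints the commands, and running them (for instance with subprocess.run) still requires siclib and the datasets.

```python
import shlex


def eval_command(benchmark, tag, overrides=()):
    """Build the argv list for one evaluation run (hypothetical helper)."""
    return [
        "python", "-m", f"siclib.eval.{benchmark}",
        "--conf", "anycalib_pretrained",
        "--tag", tag,
        "--overwrite",
        *overrides,
    ]


# (benchmark module, tag, overrides), copied from commands listed in this section
RUNS = [
    ("lamar2k_rays", "anycalib_p", []),
    ("lamar2k_rays", "anycalib_g", ["model.model_id=anycalib_gen"]),
    ("monovo2k_rays", "anycalib_d", ["model.model_id=anycalib_dist", "data.cam_id=ucm"]),
]

commands = [eval_command(*run) for run in RUNS]
for cmd in commands:
    # shlex.join renders the argv list as a copy-pasteable shell command
    print(shlex.join(cmd))
```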
Running the evaluation commands will download the dataset to data/lamar2k which will take around 400 MB of disk space.
AnyCalib trained on perspective images only (anycalib_pinhole):
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on perspective, distorted and strongly distorted images (anycalib_gen):
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Running the evaluation commands will download the dataset to data/megadepth2k which will take around 2 GB of disk space.
AnyCalib trained on perspective images only (anycalib_pinhole):
python -m siclib.eval.megadepth2k_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on perspective, distorted and strongly distorted images (anycalib_gen):
python -m siclib.eval.megadepth2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Running the evaluation commands will download the dataset to data/tartanair which will take around 1.7 GB of disk space.
AnyCalib trained on perspective images only (anycalib_pinhole):
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on perspective, distorted and strongly distorted images (anycalib_gen):
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Running the evaluation commands will download the dataset to data/stanford2d3d which will take around 844 MB of disk space.
AnyCalib trained on perspective images only (anycalib_pinhole):
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on perspective, distorted and strongly distorted images (anycalib_gen):
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Running the evaluation commands will download the dataset to data/megadepth2k-radial which will take around 1.4 GB of disk space.
AnyCalib trained on perspective, distorted and strongly distorted images (anycalib_gen):
python -m siclib.eval.megadepth2k_radial_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Running the evaluation commands will download the dataset to data/monovo2k which will take around 445 MB of disk space.
AnyCalib trained on distorted and strongly distorted images (anycalib_dist):
python -m siclib.eval.monovo2k_rays --conf anycalib_pretrained --tag anycalib_d --overwrite model.model_id=anycalib_dist data.cam_id=ucm
AnyCalib trained on perspective, distorted and strongly distorted images (anycalib_gen):
python -m siclib.eval.monovo2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen data.cam_id=ucm
To comply with the ScanNet++ license, we cannot directly share its data.
Please download the ScanNet++ dataset following the official instructions and indicate the path to the root of the dataset in the following evaluation command. 
This needs to be provided only the first time the evaluation is run. This first time, the command will automatically copy the evaluation images under data/scannetpp2k which will take around 760 MB of disk space.
AnyCalib trained on distorted and strongly distorted images (anycalib_dist):
python -m siclib.eval.scannetpp2k_rays --conf anycalib_pretrained --tag anycalib_d --overwrite model.model_id=anycalib_dist scannetpp_root=<path_to_scannetpp>
AnyCalib trained on perspective, distorted and strongly distorted images (anycalib_gen):
python -m siclib.eval.scannetpp2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen scannetpp_root=<path_to_scannetpp>
Running the evaluation commands will download the dataset to data/lamar2k_edit which will take around 224 MB of disk space.
AnyCalib trained following the WildCam [8] training protocol:
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
Running the evaluation commands will download the dataset to data/tartanair_edit which will take around 488 MB of disk space.
AnyCalib trained following the WildCam [8] training protocol:
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
Running the evaluation commands will download the dataset to data/stanford2d3d_edit which will take around 420 MB of disk space.
AnyCalib trained on edited (stretched and cropped) perspective images (anycalib_edit):
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
We extend the OpenPano dataset from GeoCalib with panoramas that need not be aligned with the gravity direction. This extended version consists of tonemapped panoramas from The Laval Photometric Indoor HDR Dataset, PolyHaven, HDRMaps, AmbientCG and BlenderKit.
Before sampling images from the panoramas, first download the Laval dataset following the instructions on the corresponding project page and place the panoramas in data/indoorDatasetCalibrated. Then, tonemap the HDR images using the following command:
python -m siclib.datasets.utils.tonemapping --hdr_dir data/indoorDatasetCalibrated --out_dir data/laval-tonemap
To download the rest of the panoramas and organize all the panoramas in their corresponding splits data/openpano_v2/panoramas/{split}, execute:
python -m siclib.datasets.utils.download_openpano --name openpano_v2 --laval_dir data/laval-tonemap
The panoramas from PolyHaven, HDRMaps, AmbientCG and BlenderKit can alternatively be downloaded manually from here.
Afterwards, the different training datasets mentioned in the paper can be created with the commands below. Passing device=cuda significantly speeds up the creation of the datasets, but if no GPU is available, the flag can be omitted.
Dataset stored in data/openpano_v2/openpano_v2:
python -m siclib.datasets.create_dataset_from_pano --config-name openpano_v2 device=cuda
Dataset stored in data/openpano_v2/openpano_v2_gen:
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_gen device=cuda
Dataset stored in data/openpano_v2/openpano_v2_radial:
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_radial device=cuda
Dataset stored in data/openpano_v2/openpano_v2_dist:
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_dist device=cuda
As with the evaluation, the training code is built upon the siclib library from GeoCalib. Here we adapt their instructions to AnyCalib. siclib can be installed by executing:
pip install -e siclib
Once at least one of the extended OpenPano datasets (openpano_v2) has been downloaded and prepared, we can train AnyCalib with it.
For training with the default dataset (no data.dataset_dir override):
python -m siclib.train anycalib_op_p --conf anycalib --distributed
Feel free to use any other experiment name. By default, the checkpoints will be written to outputs/training/. The default batch size is 24, which requires at least one NVIDIA Tesla V100 GPU with 32 GB of VRAM. If only one GPU is used, the flag --distributed can be omitted. Configurations are managed by Hydra and can be overwritten from the command line.
For example, for training with the dataset in data/openpano_v2/openpano_v2_gen:
python -m siclib.train anycalib_op_g --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_gen'
For training with the dataset in data/openpano_v2/openpano_v2_dist:
python -m siclib.train anycalib_op_d --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_dist'
For training with the dataset in data/openpano_v2/openpano_v2_radial:
python -m siclib.train anycalib_op_r --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_radial'
For training with edited (stretched and cropped) images from data/openpano_v2/openpano_v2:
python -m siclib.train anycalib_op_e --conf anycalib --distributed \
data.dataset_dir='data/openpano_v2/openpano_v2' \
data.im_geom_transform.change_pixel_ar=true \
data.im_geom_transform.crop=0.5
After training, the model can be evaluated using its experiment name:
python -m siclib.eval.<benchmark> --checkpoint <experiment_name> --tag <experiment_tag> --conf anycalib
Thanks to the authors of GeoCalib for open-sourcing the comprehensive and easy-to-use siclib, which we use as the base of our evaluation and training code.
Thanks to the authors of The Laval Photometric Indoor HDR Dataset for allowing us to release the weights of AnyCalib under a permissive license.
Thanks also to the authors of The Laval Photometric Indoor HDR Dataset, PolyHaven, HDRMaps, AmbientCG and BlenderKit for providing high-quality freely-available panoramas that made the training of AnyCalib possible.
If you use any ideas from the paper or code from this repo, please consider citing:
@InProceedings{tirado2025anycalib,
  author={Javier Tirado-Gar{\'\i}n and Javier Civera},
  title={{AnyCalib: On-Manifold Learning for Model-Agnostic Single-View Camera Calibration}},
  booktitle={ICCV},
  year={2025}
}
Code and weights are provided under the Apache 2.0 license.
[1] Close-Range Camera Calibration. D.C. Brown, 1971.
[2] A Generic Camera Model and Calibration Method for Conventional, Wide-Angle, and Fish-Eye Lenses. J. Kannala, S.S. Brandt, TPAMI, 2006.
[3] Single View Point Omnidirectional Camera Calibration from Planar Grids. C. Mei, P. Rives, ICRA, 2007.
[4] An Enhanced Unified Camera Model. B. Khomutenko, et al., IEEE RA-L, 2016.
[5] Simultaneous Linear Estimation of Multiple View Geometry and Lens Distortion. A.W. Fitzgibbon, CVPR, 2001.
[6] The Double Sphere Camera Model. V. Usenko, et al., 3DV, 2018.
[7] BabelCalib: A Universal Approach to Calibrating Central Cameras. Y. Lochman, et al., ICCV, 2021.
[8] Tame a Wild Camera: In-the-Wild Monocular Camera Calibration. S. Zhu, et al., NeurIPS, 2023.