This repo contains code and weights for SPoT: Subpixel Placement of Tokens, accepted for ECLR, ICCVW 2025.
For an introduction to our work, visit the project webpage.
The package can currently be installed via:
# HTTPS
pip install git+https://github.com/dsb-ifi/SPoT.git
# SSH
pip install git+ssh://git@github.com/dsb-ifi/SPoT.git
To load the model, first download the checkpoints from Google Drive.
Then extract the checkpoints into a folder named checkpoints/ in the repo.
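Before loading a model, it can help to confirm that the checkpoints were extracted where expected. The following is a minimal sketch using only the standard library. The helper name `list_checkpoints` is hypothetical and not part of the `spot` package, and it makes no assumption about checkpoint file names or extensions.

```python
from pathlib import Path


def list_checkpoints(repo_root="."):
    """Return the sorted names of the files found in <repo_root>/checkpoints/.

    Hypothetical helper for illustration; not part of the spot package.
    """
    ckpt_dir = Path(repo_root) / "checkpoints"
    if not ckpt_dir.is_dir():
        # Fail early with a hint instead of a less obvious error at load time.
        raise FileNotFoundError(
            f"Expected a checkpoints/ folder at {ckpt_dir.resolve()}; "
            "download and extract the checkpoints first."
        )
    # Only regular files are listed; subfolders are ignored.
    return sorted(p.name for p in ckpt_dir.iterdir() if p.is_file())
```

Calling `list_checkpoints()` from the repo root should return the extracted file names, or raise `FileNotFoundError` if the folder is missing.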
The model can then be loaded as follows:
from spot.load_models import *
model_name = 'spot_mae_b16'
assert model_name in valid_models
model = load_trained_model(
    model_name=model_name,
    sampler='grid_center',  # Spatial prior
    ksize=16,               # Window size
    n_features=25,          # Number of tokens
)
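Note that an `assert` is skipped when Python runs with `-O`, and it does not say which names are accepted. As one possible alternative, the sketch below validates the name explicitly before delegating to a loader. The wrapper `load_checked` is hypothetical. It takes the collection of valid names and the loader as arguments, so it assumes nothing about `spot.load_models` beyond the call shown above.

```python
def load_checked(model_name, valid_names, loader, **kwargs):
    """Validate model_name against valid_names, then call loader.

    Hypothetical wrapper for illustration; not part of the spot package.
    """
    names = sorted(valid_names)  # works for lists, sets, and dict keys
    if model_name not in names:
        raise ValueError(
            f"Unknown model {model_name!r}; expected one of: {', '.join(names)}"
        )
    # Remaining keyword arguments are forwarded to the loader unchanged.
    return loader(model_name=model_name, **kwargs)
```

With the package installed, this would be called as `load_checked('spot_mae_b16', valid_models, load_trained_model, sampler='grid_center', ksize=16, n_features=25)`.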
We provide a Jupyter notebook that illustrates loading, evaluating, and extracting token placements for the models.
If you find our work useful, please consider citing our paper.
@inproceedings{hjelkremtan2025spot,
  title={{SPoT}: Subpixel Placement of Tokens in Vision Transformers},
  author={Hjelkrem-Tan, Martine and Aasan, Marius and Arteaga, Gabriel Y. and Ram\'irez Rivera, Ad\'in},
  booktitle={{CVF/ICCV} Efficient Computing under Limited Resources: Visual Computing ({ECLR} {ICCVW})},
  year={2025}
}

