Description
🐛 Bug
Following the instructions in #1144 by @Ryan-Qiyu-Jiang, I attempted to run the command to fine-tune UNITER on VQA2, but the pretrained checkpoint appears to be missing the necessary config.
Also, the uniter.pretrained.tar.gz file appears to contain an invalid folder path, which I was able to work around manually (described below).
Command
mmf_run config=projects/uniter/configs/vqa2/defaults.yaml run_type=train_val dataset=vqa2 model=uniter checkpoint.resume_zoo=uniter.pretrained
To Reproduce
Steps to reproduce the behavior:
1. Run the command:
   mmf_run config=projects/uniter/configs/vqa2/defaults.yaml run_type=train_val dataset=vqa2 model=uniter checkpoint.resume_zoo=uniter.pretrained
2. Fix the tarball folder path:
   mv ~/.cache/torch/mmf/data/models/uniter.pretrained/private/home/ryanjiang/winoground/pretrained_models/uniter_pretrained_mmf.pth ~/.cache/torch/mmf/data/models/uniter.pretrained/
   - If you don't do this, you will get an error like "AssertionError: None or multiple checkpoints files. MMF doesn't know what to do.", as it can't find the checkpoint file in the nested folder.
3. Run the same command again from step 1. It now fails with:
File "/home/maxsparrow/.pyenv/versions/miniconda3-4.7.12/envs/mmf3/bin/mmf_run", line 33, in <module>
sys.exit(load_entry_point('mmf', 'console_scripts', 'mmf_run')())
File "/home/maxsparrow/code/CS7643/mmf/mmf_cli/run.py", line 133, in run
main(configuration, predict=predict)
File "/home/maxsparrow/code/CS7643/mmf/mmf_cli/run.py", line 52, in main
trainer.load()
File "/home/maxsparrow/code/CS7643/mmf/mmf/trainers/mmf_trainer.py", line 46, in load
self.on_init_start()
File "/home/maxsparrow/code/CS7643/mmf/mmf/trainers/core/callback_hook.py", line 20, in on_init_start
callback.on_init_start(**kwargs)
File "/home/maxsparrow/code/CS7643/mmf/mmf/trainers/callbacks/checkpoint.py", line 30, in on_init_start
self._checkpoint.load_state_dict()
File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 244, in load_state_dict
load_pretrained=ckpt_config.resume_pretrained,
File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 268, in _load
ckpt, should_continue = self._load_from_zoo(file)
File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 448, in _load_from_zoo
zoo_ckpt = load_pretrained_model(file)
File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 163, in load_pretrained_model
return _load_pretrained_model(model_name_or_path_or_checkpoint, args, kwargs)
File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 140, in _load_pretrained_model
config = get_config_from_folder_or_ckpt(download_path, ckpt)
File "/home/maxsparrow/code/CS7643/mmf/mmf/utils/checkpoint.py", line 90, in get_config_from_folder_or_ckpt
"No configs provided with pretrained model"
AssertionError: No configs provided with pretrained model while checkpoint also doesn't have configuration.
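For anyone who wants to script the manual `mv` workaround from step 2, here is a minimal sketch. The helper name `flatten_checkpoints` and the `*.pth` pattern are my own assumptions, not part of MMF. It only moves nested checkpoint files up to the top of the model folder and does nothing about the missing config.

```python
import shutil
from pathlib import Path
from typing import List


def flatten_checkpoints(model_dir: Path, pattern: str = "*.pth") -> List[Path]:
    """Move checkpoint files nested below model_dir up into model_dir itself.

    Hypothetical helper: it mirrors the manual `mv` workaround, nothing more.
    """
    moved = []
    # sorted() materializes the matches before any file is moved.
    for ckpt in sorted(model_dir.rglob(pattern)):
        if ckpt.parent == model_dir:
            continue  # already at the top level
        target = model_dir / ckpt.name
        if target.exists():
            raise FileExistsError(f"{target} already exists, refusing to overwrite")
        shutil.move(str(ckpt), str(target))
        moved.append(target)
    return moved
```

On my machine the call would be `flatten_checkpoints(Path.home() / ".cache/torch/mmf/data/models/uniter.pretrained")`. The cache location may differ on other setups.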
Expected behavior
I expected training to start from the pretrained UNITER checkpoint via resume_zoo. Other pretrained resume_zoo checkpoints include a config.yaml file in the downloaded tarball, but this one does not.
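To compare this download with a working zoo model, a small sketch like the following lists what sits at the top level of an extracted model folder. The helper name `inspect_model_dir` and the file patterns (`*.yaml`, `.pth`, `.ckpt`) are assumptions on my part, guessed from the two assertion messages above rather than taken from the MMF source.

```python
from pathlib import Path
from typing import Dict, List


def inspect_model_dir(model_dir: Path) -> Dict[str, List[str]]:
    """List top-level config and checkpoint files in an extracted model folder.

    Hypothetical helper; the suffixes checked here are assumptions.
    """
    configs = sorted(p.name for p in model_dir.glob("*.yaml"))
    checkpoints = sorted(
        p.name
        for p in model_dir.iterdir()
        if p.is_file() and p.suffix in {".pth", ".ckpt"}
    )
    return {"configs": configs, "checkpoints": checkpoints}
```

An empty `configs` list for uniter.pretrained, next to a non-empty one for another zoo model, would match the behavior described here.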