
Conversation


arkel23 commented Dec 22, 2020

* Forked from the [Luke Melas-Kyriazi repository](https://github.com/lukemelas/PyTorch-Pretrained-ViT).
* Added support for the 'H-14' and 'L-16' ViT models.
* Added support for downloading the models directly from Google's cloud storage.
* Corrected the Jax-to-PyTorch weight conversion. The previous methodology produced `.pth` state_dict files without the representation layer, so `ViT(..., load_repr_layer=True)` raised an error. For inference alone the representation layer is unnecessary, as discussed in the original Vision Transformer paper, but it can be useful for other applications and experiments, so I added `download_convert_models.py` to first download the required models and convert them with all of their weights; after that, all parameters can be tuned (a loading sketch follows this list).
* Added support for visualizing attention by returning the score values from the multi-head self-attention layers (a sketch follows the commit list below). The visualization script was mostly taken from the [jeonsworld/ViT-pytorch repository](https://github.com/jeonsworld/ViT-pytorch).
* Added examples for inference (single image) and fine-tuning/training (using CIFAR-10); sketches of both appear below.
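As a minimal sketch of the corrected loading path: this assumes the fork keeps the upstream `ViT` constructor and the `'L_16'` model name from lukemelas/PyTorch-Pretrained-ViT, and that `download_convert_models.py` has already been run to produce a checkpoint with all weights.

```python
# Minimal sketch, assuming the upstream-style constructor and that
# download_convert_models.py has already produced a full .pth state_dict.
from pytorch_pretrained_vit import ViT

# With the corrected conversion, the checkpoint includes the representation
# layer, so requesting it no longer raises a missing-key error.
model = ViT('L_16', pretrained=True, load_repr_layer=True)
model.eval()
```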

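For the single-image inference example, a hedged sketch in the upstream style; the model name `'B_16_imagenet1k'`, the 384×384 input size, and the image path are illustrative assumptions, not taken verbatim from this PR.

```python
# Hedged single-image inference sketch; image path and input size are
# placeholders chosen to match the upstream ImageNet-1k checkpoints.
import torch
from PIL import Image
from torchvision import transforms
from pytorch_pretrained_vit import ViT

model = ViT('B_16_imagenet1k', pretrained=True)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize([0.5] * 3, [0.5] * 3),
])
img = preprocess(Image.open('img.jpg')).unsqueeze(0)  # (1, 3, 384, 384)

with torch.no_grad():
    logits = model(img)
print(logits.argmax(dim=-1))  # predicted ImageNet class index
```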
arkel23 and others added 13 commits December 18, 2020 00:32
…s for the conversion, added download links to download.sh and configs.py for models that were missing
… beforehand, it directly downloads them to the torch hub cache and then converts them on the fly
…to return head scores if the parameter visualize=True is given; otherwise functionality stays the same
…y, also added an example with CIFAR-10. Changed the loading logic to allow appropriate loading of all layers regardless of whether the fc layers are loaded with a different number of classes and/or a representation layer. Also verified that they load properly
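To illustrate the `visualize=True` commit above, a hedged sketch: the keyword name is taken from the commit message, and the `(logits, scores)` return convention is inferred from the PR description, so the exact signature may differ.

```python
# Hedged sketch of retrieving attention scores; the visualize=True keyword
# and the (logits, scores) return shape are assumptions from this PR's text.
import torch
from pytorch_pretrained_vit import ViT

model = ViT('B_16', pretrained=True, visualize=True)
model.eval()

x = torch.randn(1, 3, 224, 224)  # dummy input; 'B_16' expects 224x224 here
with torch.no_grad():
    logits, scores = model(x)    # scores: per-layer attention tensors

# Average over heads in the last layer for a coarse token-to-token map.
attn_map = scores[-1].mean(dim=1)  # (batch, tokens, tokens)
```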
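And a hedged sketch of the CIFAR-10 fine-tuning example: the `num_classes` and `image_size` keywords follow the upstream constructor, the PR's loading logic is assumed to skip the fc head when class counts differ, and the hyperparameters are placeholders.

```python
# Hedged CIFAR-10 fine-tuning sketch; hyperparameters are placeholders.
import torch
import torchvision
import torchvision.transforms as T
from pytorch_pretrained_vit import ViT

# num_classes=10 differs from the checkpoint, so the pretrained fc head is
# assumed to be skipped by this PR's loading logic.
model = ViT('B_16', pretrained=True, num_classes=10, image_size=224)

transform = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize([0.5] * 3, [0.5] * 3),
])
train_set = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True, transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

optimizer = torch.optim.SGD(model.parameters(), lr=3e-3, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for images, labels in loader:  # one epoch
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```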
lukemelas (Owner) commented Dec 23, 2020 via email

huananerban commented:

@arkel23 I would like to ask why my L-16 pre-trained model still can't be trained; I get the error "Missing keys when loading pretrained weights: []".
