General changes and adding of support for more functionality #7
Open · arkel23 wants to merge 37 commits into lukemelas:master from arkel23:master
Conversation
Commits (truncated messages):
* …s for the conversion, added download links to download.sh and configs.py for models that were missing
* … beforehand; it directly downloads them to torch hub and then converts them on the fly
* …ure that it loads the representation layer
* …ined_vit into utils.py
* …to return head scores if the parameter visualize=True is given; otherwise functionality stays the same
* …y; also added an example with CIFAR-10. Changed the loading logic to allow appropriate loading of all layers regardless of whether the fc layers have a different number of classes and/or a representation layer, and verified that they load properly (see the sketch after this list).
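As a rough illustration of the loading logic described in the last commit, the sketch below copies only the pretrained tensors whose names and shapes match the model, skipping the rest (e.g. an fc head with a different class count). The helper name and returned diagnostics are hypothetical, not the PR's actual code:

```python
import torch

def load_matching_weights(model, state_dict):
    # Hypothetical helper: copy every pretrained tensor whose name and shape
    # match the model, and skip the rest (e.g. an fc head whose number of
    # classes differs, or a representation layer absent from the checkpoint).
    own_state = model.state_dict()
    loaded, skipped = [], []
    for name, param in state_dict.items():
        if name in own_state and own_state[name].shape == param.shape:
            own_state[name].copy_(param)  # in-place copy updates the model
            loaded.append(name)
        else:
            skipped.append(name)
    missing = [n for n in own_state if n not in state_dict]
    return loaded, skipped, missing  # inspect these to verify everything loaded
```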
* Forked from the [Luke Melas-Kyriazi repository](https://github.com/lukemelas/PyTorch-Pretrained-ViT).
* Added support for the 'H-14' and 'L-16' ViT models.
* Added support for downloading the models directly from Google's cloud storage.
* Corrected the JAX to PyTorch weights conversion. The previous methodology would produce .pth state_dict files without the 'representation layer', so `ViT(load_repr_layer=True)` would lead to an error. If you are only interested in inference, the representation layer is unnecessary, as discussed in the original Vision Transformer paper, but for other applications and experiments it may be useful, so I added `download_convert_models.py` to first download the required models and convert them with all of their weights; after that you can tune the parameters fully.
* Added support for visualizing attention by returning the score values from the multi-head self-attention layers (see the sketch after this list). The visualization script was mostly taken from the [jeonsworld/ViT-pytorch repository](https://github.com/jeonsworld/ViT-pytorch).
* Added examples for inference (single image) and fine-tuning/training (using CIFAR-10).
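A sketch of how the attention visualization described above might be used. The model name, input resolution, and the exact return format of `visualize=True` are assumptions based on the commit messages, not guaranteed by this PR:

```python
import torch
from pytorch_pretrained_vit import ViT  # import path assumed from the forked repo

model = ViT('B_16_imagenet1k', pretrained=True)
model.eval()

img = torch.randn(1, 3, 384, 384)  # dummy input at the assumed model resolution
with torch.no_grad():
    # Assumed behavior: with visualize=True the forward pass also returns the
    # multi-head self-attention scores, one tensor per transformer layer.
    logits, scores = model(img, visualize=True)

# Assuming each entry has shape (batch, heads, tokens, tokens), averaging over
# heads gives one token-to-token attention map per layer for plotting.
maps = [s.mean(dim=1) for s in scores]
```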
* …net21k and not finetuned
* …processed tokens directly
* …onfiguration dictionary (a sketch of such a dictionary follows)
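For concreteness, a hypothetical sketch of what per-model entries in configs.py might look like for the imagenet21k (not finetuned) checkpoints. The field names and URL pattern are illustrative; only the architecture numbers follow the ViT paper:

```python
# Hypothetical configuration dictionary; field names and URLs are illustrative.
PRETRAINED_CONFIGS = {
    'L_16': dict(
        patches=16, dim=1024, ff_dim=4096, num_heads=16, num_layers=24,
        num_classes=21843,  # imagenet21k head, not finetuned
        url='https://storage.googleapis.com/vit_models/imagenet21k/ViT-L_16.npz',
    ),
    'H_14': dict(
        patches=14, dim=1280, ff_dim=5120, num_heads=16, num_layers=32,
        num_classes=21843,
        url='https://storage.googleapis.com/vit_models/imagenet21k/ViT-H_14.npz',
    ),
}
```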
@arkel23 I would like to ask why my L-16 pre-trained model still can't be trained; I get the error "Missing keys when loading pretrained weights: []".