Transformers v4.6.0 is the first release dedicated to computer vision.<p>- CLIP from OpenAI, for image-text similarity or zero-shot image classification<p>- ViT from Google AI<p>- DeiT from Facebook AI, for SOTA image classification<p>You can try ViT/DeiT on the hub: <a href="https://huggingface.co/google/vit-base-patch16-224" rel="nofollow">https://huggingface.co/google/vit-base-patch16-224</a>
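<p>For anyone wondering how image-text similarity turns into zero-shot classification, here is a rough sketch in plain Python. It does not call the Transformers library; the embeddings are made-up toy vectors, and the function names and the fixed `scale` value (standing in for CLIP's learned temperature) are my own assumptions for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_probs(image_emb, text_embs, scale=100.0):
    """Softmax over scaled image-text similarities.

    `scale` is a hypothetical fixed value; a real model learns it.
    """
    logits = [scale * cosine(image_emb, t) for t in text_embs]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Candidate labels written as text prompts, one per class.
labels = ["a photo of a cat", "a photo of a dog"]

# Toy embeddings (made up); a real model would produce these
# from its image encoder and text encoder.
image_emb = [0.9, 0.1, 0.2]
text_embs = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.1]]

probs = zero_shot_probs(image_emb, text_embs)
best = labels[probs.index(max(probs))]
print(best, probs)
```

<p>The point is that no classifier head is trained: the class names themselves are embedded as text, and the label whose embedding sits closest to the image embedding wins.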