This video walks through the Keras Code Example implementation of Vision Transformers! I see this as a huge opportunity for graduate students and researchers because the architecture still has serious room for improvement. I predict that attention will outperform CNN models such as ResNets and EfficientNets; it will just take the discovery of complementary priors, e.g. custom data augmentations or pre-training tasks. I hope you find this video useful. Please check out the rest of the Keras Code Examples playlist!
Content Links:
Keras Code Examples - Vision Transformers: https://keras.io/examples/vision/imag...
Google AI Blog Visualization: https://ai.googleblog.com/2020/12/tra...
Formal Paper describing this model: https://arxiv.org/pdf/2010.11929.pdf
TensorFlow Addons: https://www.tensorflow.org/addons
TensorFlow Addons - AdamW: https://www.tensorflow.org/addons/api...
Chapters
0:00 Welcome to the Keras Code Examples!
0:45 Vision Transformer Explained
2:47 TensorFlow Addons
3:29 Hyperparameters
7:04 Data Augmentations
8:30 Patch Construction
11:52 Patch Embeddings
14:01 ViT Classifier
16:30 Compile and Run
19:02 Analysis of Final Performance
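The patch-construction step covered at 8:30 can be sketched roughly as follows. This is a minimal illustrative example, assuming 32x32 CIFAR-10-style inputs and 6x6 patches; the variable names are my own, so see the linked Keras Code Example for the actual implementation:

```python
import tensorflow as tf

# Dummy batch of one 32x32 RGB image (assumed input size, not from the video).
images = tf.random.uniform((1, 32, 32, 3))
patch_size = 6

# Slice each image into non-overlapping patch_size x patch_size patches.
patches = tf.image.extract_patches(
    images=images,
    sizes=[1, patch_size, patch_size, 1],
    strides=[1, patch_size, patch_size, 1],
    rates=[1, 1, 1, 1],
    padding="VALID",
)

# Flatten the spatial grid of patches into a sequence:
# a 32x32 image yields a 5x5 grid, i.e. 25 patches of 6*6*3 = 108 values each.
patches = tf.reshape(
    patches, (images.shape[0], -1, patch_size * patch_size * 3)
)
print(patches.shape)  # (1, 25, 108)
```

Each flattened patch then gets projected to the model dimension and combined with a positional embedding before entering the Transformer encoder.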