Vision Transformer with BatchNorm: Optimizing the depth

Tokens-to-Token Vision Transformers, Explained

Position Embeddings for Vision Transformers, Explained

Attention for Vision Transformers, Explained

Vision Transformers, Explained

The Rise of Vision Transformers