Tokens-to-Token Vision Transformers, Explained

Position Embeddings for Vision Transformers, Explained