Understanding Transformers

CLIP, LLaVA, and the Brain

From Masked Image Modeling to Autoregressive Image Modeling

Transformers Can Now Work Pixel by Pixel, Says Meta AI’s New Study

Flash Attention (Fast and Memory-Efficient Exact Attention with IO-Awareness): A Deep Dive

A Comprehensive Guide on i-Transformer

Mastering Decoder-Only Transformer: A Comprehensive Guide

Understanding Transformers: A Deep Dive into NLP’s Core Technology

Multimodal Large Language Models & Apple’s MM1

Deep Dive into Transformers by Hand ✍︎