Large Language Models, StructBERT — Incorporating Language Structures into Pretraining

An Introduction To Deep Learning For Sequential Data

FastSpeech: Paper Overview & Implementation

Towards Generative AI for Model Architecture

Implementing a Transformer Encoder from Scratch with JAX and Haiku

Large Language Models, ALBERT — A Lite BERT for Self-supervised Learning

Cracking the Code LLMs

The A-Z of Transformers: Everything You Need to Know

Topological Generalisation with Advective Diffusion Transformers

Transforming text into vectors: TSDAE’s unsupervised approach to enhanced embeddings