Presented at UWORCS 2025, Western University.
This talk provides an accessible end-to-end walkthrough of modern large language models—from raw text pretraining to practical deployment. The goal is to demystify the engineering and research decisions behind models like GPT and LLaMA.
Topics Covered
- Tokenization & vocabulary design — BPE, SentencePiece, and the trade-offs involved
- Pretraining objectives — next-token prediction, masked language modeling, and why they work
- Scaling laws — the Chinchilla compute-optimal regime and what it means for model design
- Instruction tuning — supervised fine-tuning (SFT) on curated demonstrations
- Alignment via RLHF — reward modeling, PPO, and DPO as a simpler alternative
- Practical considerations — mixed precision, gradient checkpointing, and distributed training
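To make the tokenization bullet concrete, here is a minimal sketch of the BPE merge-learning loop on a toy corpus. The function name and corpus are illustrative, not from the talk; real tokenizers (e.g. SentencePiece) add byte fallback, pre-tokenization rules, and much larger merge tables.

```python
from collections import Counter

def byte_pair_merges(corpus, num_merges):
    """Learn BPE merges on a toy space-separated corpus.

    Each word starts as a sequence of characters; at every step the
    most frequent adjacent symbol pair is merged into a new token.
    """
    # Represent each word as a tuple of symbols, weighted by frequency.
    vocab = Counter(tuple(word) for word in corpus.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the weighted vocabulary.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        merged = {}
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] = merged.get(tuple(out), 0) + freq
        vocab = merged
    return merges

merges = byte_pair_merges("low low low lower lowest", 3)
# First merges fuse the frequent stem: l+o, lo+w, low+e.
```

The vocabulary-size trade-off in the bullet shows up here directly: more merges mean shorter token sequences but a larger embedding table.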
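The next-token-prediction objective in the pretraining bullet is just cross-entropy on a shifted sequence. A minimal NumPy sketch (function and shapes are illustrative assumptions, not from the talk):

```python
import numpy as np

def next_token_loss(logits, tokens):
    """Average cross-entropy for next-token prediction.

    logits: (T, V) scores where row t predicts token t+1;
    tokens: (T+1,) integer ids. Targets are the inputs shifted by one.
    """
    targets = tokens[1:]
    # Numerically stable log-softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    # Negative log-likelihood of each observed next token, averaged.
    return -log_probs[np.arange(len(targets)), targets].mean()

# Uniform logits over a 4-token vocabulary give a loss of log(4).
loss = next_token_loss(np.zeros((3, 4)), np.array([0, 1, 2, 3]))
```

The shift by one is the whole trick: the same forward pass yields a training signal at every position, which is part of why the objective scales so well.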
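The Chinchilla bullet reduces to simple arithmetic under two widely used approximations: training compute C ≈ 6·N·D FLOPs, and compute-optimal data D ≈ 20·N tokens. A sketch under those assumptions (the function name is ours):

```python
def chinchilla_optimal(compute_flops):
    """Rough compute-optimal split of a FLOP budget.

    Assumes C ~ 6 * N * D and the Chinchilla rule of thumb D ~ 20 * N,
    so N ~ sqrt(C / 120) parameters and D ~ 20 * N tokens.
    """
    n_params = (compute_flops / 120) ** 0.5
    n_tokens = 20 * n_params
    return n_params, n_tokens

# For a 1e23 FLOP budget this suggests a model of roughly 29B
# parameters trained on roughly 580B tokens.
n, d = chinchilla_optimal(1e23)
```

The design implication for the talk's point: for a fixed budget, a smaller model trained on more tokens often beats a larger undertrained one.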
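Why DPO counts as "a simpler alternative" in the alignment bullet is easiest to see from its loss: it needs only per-sequence log-probabilities from the policy and a frozen reference model, with no reward model or PPO loop. A per-pair sketch (argument names are illustrative):

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Pushes the policy to widen the gap between its chosen and rejected
    log-probs relative to the reference model; beta controls how hard.
    """
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)): zero margin costs log(2),
    # large positive margins drive the loss toward zero.
    return -np.log(1.0 / (1.0 + np.exp(-beta * margin)))
```

In practice the log-probabilities come from summing token log-softmax scores over each response, but the optimization itself is ordinary supervised gradient descent on this objective.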
