Instructor: Edgar Dobriban
This course explores Large Language Models (LLMs), from the basics to cutting-edge research.
- Course Syllabus.
- Lecture Notes (work in progress).
| Link | Topic |
|---|---|
| 01 | Motivation and context |
| 02 | AI: goals and definitions; the role of LLMs |
| 03 | LLM architectures: attention and transformers (see the attention sketch below the table) |
| 04 | Insight into transformer architectures |
| 05 | Position encoding (see the sinusoidal-encoding sketch below) |
| 06 | Specific LLM families: GPT, Llama, DeepSeek, LLM360 |
| 07 | Training LLMs: pre- and post-training, supervised fine-tuning, learning from preferences (PPO, DPO, GRPO); see the DPO sketch below |
| 08 | Test-time computation: sampling, prompting, reasoning (see the sampling sketch below) |
| 09 | Empirical behaviors: scaling laws, emergence, memorization, super-phenomena (see the scaling-law formula below) |
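Lecture 03 covers attention and transformers. As a companion, here is a minimal NumPy sketch of single-head scaled dot-product self-attention with a causal mask; it is an illustration, not code from the lecture notes, and all names and shapes are chosen for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv, causal=True):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    X:            (seq_len, d_model) token embeddings
    Wq, Wk, Wv:   (d_model, d_head) projection matrices
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # (seq_len, d_head) each
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # (seq_len, seq_len)
    if causal:
        # Mask future positions so token t attends only to tokens <= t.
        mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    return softmax(scores) @ V                 # (seq_len, d_head)

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 16))
Wq, Wk, Wv = (rng.standard_normal((16, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (5, 8)
```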
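Lecture 05 is on position encoding. Below is a sketch of the sinusoidal encoding from "Attention Is All You Need" (Vaswani et al., 2017); the course also covers other schemes, and this particular function is only an illustration.

```python
import numpy as np

def sinusoidal_position_encoding(seq_len, d_model):
    """PE[t, 2i] = sin(t / 10000^(2i/d_model)), PE[t, 2i+1] = cos(same angle)."""
    positions = np.arange(seq_len)[:, None]           # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]          # (1, d_model // 2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions
    pe[:, 1::2] = np.cos(angles)   # odd dimensions
    return pe

print(sinusoidal_position_encoding(seq_len=128, d_model=64).shape)  # (128, 64)
```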
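Lecture 07 includes learning from preferences. As one worked example, here is a sketch of the DPO loss (Rafailov et al., 2023): it pushes the policy, relative to a frozen reference model, to assign higher log-probability to the preferred response than to the rejected one. The function assumes per-response summed log-probabilities have already been computed; the inputs below are illustrative.

```python
import numpy as np

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for a batch of preference pairs (illustrative sketch).

    logp_w, logp_l:          policy log-probs of chosen / rejected responses
    ref_logp_w, ref_logp_l:  reference-model log-probs of the same responses
    beta:                    strength of the implicit KL penalty

    Per pair: -log sigmoid(beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l)))
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log sigmoid(x) = log(1 + exp(-x)); logaddexp computes it stably.
    return np.logaddexp(0.0, -margin).mean()

# Toy batch in which the policy already slightly prefers the chosen responses.
logp_w, logp_l = np.array([-10.0, -12.0]), np.array([-11.0, -12.5])
ref_w, ref_l = np.array([-11.0, -12.0]), np.array([-11.0, -12.0])
print(dpo_loss(logp_w, logp_l, ref_w, ref_l))
```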
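Lecture 08 covers test-time computation, where the most basic knobs are the decoding parameters. Here is a sketch of temperature and top-k sampling from a vector of next-token logits; the function name and defaults are illustrative.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=None, rng=None):
    """Sample a token id from next-token logits (illustrative sketch).

    temperature < 1 sharpens the distribution; > 1 flattens it.
    top_k, if set, restricts sampling to the k highest-scoring tokens.
    """
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    if top_k is not None:
        # Keep only the top-k logits (ties at the threshold are kept).
        kth = np.sort(logits)[-top_k]
        logits = np.where(logits < kth, -np.inf, logits)
    probs = np.exp(logits - logits.max())  # softmax, computed stably
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

print(sample_next_token([2.0, 1.0, 0.5, -1.0], temperature=0.7, top_k=3,
                        rng=np.random.default_rng(0)))
```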
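Lecture 09 discusses scaling laws. For reference, the parametric form fitted by Hoffmann et al. (2022, the "Chinchilla" paper) models loss as a function of parameter count $N$ and training-token count $D$:

```math
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

Here $E$ is the irreducible loss, and the fitted exponents were roughly $\alpha \approx 0.34$ and $\beta \approx 0.28$, which is what motivates scaling parameters and data together under a fixed compute budget.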
Presentations:
- Calibrated Language Models Must Hallucinate, presented by Georgy Noarov.
- Representations in Deep Neural Networks, presented by Joseph H. Rudoler.
- First-Person Fairness in Chatbots, presented by Varun Gupta.
- LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations, presented by Ryan Chan.
- Representational Abilities of Transformers, presented by Soham Mallick and Manit Paul.
- AI Control: Protocols and methods for deploying untrusted AI models, presented by Davis Brown.
- Diffusion LLMs, presented by Zhihan Huang and Kevin Jiang.
- Various forms of preference optimization, presented by Tao Wang and Sunay Joshi.
- Adversarial Reasoning in LLMs, presented by Mahdi Sabbaghi.
- Transformer Circuits: Mathematical Framework and In-context Learning, presented by Hwai-Liang Tung and Yu Huang.
- Model Collapse, presented by Xuyang Chen and Xianglong Hou.
Related courses:
- Foundations of Large Language Models, U of Michigan, 2024.
- Language Modeling from Scratch, Stanford, Spring 2024.
- Recent Advances on Foundation Models, U of Waterloo, Winter 2024.
- Large Models, U of Toronto, Winter 2025.
- Advanced NLP, CMU, Spring 2025.
Additional resources:
- Andrej Karpathy's Neural Networks: Zero to Hero video lectures: an entirely code-based, hands-on tutorial on implementing basic autodiff, neural nets, language models, and a mini GPT-2 (124M parameters).
- The Llama 3 Herd of Models describes the open-weights Llama models developed by Meta; it is arguably one of the most informative public accounts of how LLMs are built.
- DeepSeek-V3 Technical Report and DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning describe the open-weights DeepSeek V3 and R1 models, which combine several training innovations to reach performance comparable to some top closed models.
- The corresponding sections in the Understanding Deep Learning book. See also the associated tutorial posts: LLMs; Transformers 1, 2, 3; Training and fine-tuning; Inference.
- Foundations of Large Language Models book.