Stanford CME295 Transformers & LLMs | Autumn 2025 | Lecture 9 - Recap & Current Trends
For more information about Stanford’s graduate programs, visit: https://online.stanford.edu/graduate-education
December 5, 2025
This lecture covers:
• Recap
• Trending topics
• Closing thoughts
To follow along with the course schedule and syllabus, visit: https://cme295.stanford.edu/syllabus/
Chapters:
00:00:00 Introduction
00:01:12 Transformer
00:06:35 Transformer-based models & tricks
00:11:17 Large Language Models
00:15:05 LLM training
00:24:09 LLM tuning
00:29:41 LLM reasoning
00:38:37 Agentic LLMs (RAG, tool calling)
00:44:09 LLM evaluation
00:48:57 Vision Transformer
01:04:02 Diffusion-based LLMs
01:23:38 Closing thoughts
01:50:16 Thank you!
Afshine Amidi is an Adjunct Lecturer at Stanford University.
Shervine Amidi is an Adjunct Lecturer at Stanford University.
View the course playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rOCXd21gf0CF4xr35yINeOy
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 1: Class Intro
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
April 2, 2025
This lecture covers:
• Class introduction
• Markov Decision Processes
• Why study deep reinforcement learning?
• Intro to modeling behavior and reinforcement learning
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Chelsea Finn
Assistant Professor in Computer Science and Electrical Engineering at Stanford University and co-founder of Pi.
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 2: Imitation Learning
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
April 4, 2025
This lecture covers:
• Imitation learning basics
• Learning expressive policy distributions
• Learning from online interventions
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Chelsea Finn
Assistant Professor in Computer Science and Electrical Engineering at Stanford University and co-founder of Pi.
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 3: Policy Gradients
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
April 9, 2025
This lecture covers:
• Key intuition behind policy gradients
• How to implement, when to use policy gradients
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Chelsea Finn
Assistant Professor in Computer Science and Electrical Engineering at Stanford University and co-founder of Pi.
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 4: Actor-Critic Methods
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
April 11, 2025
This lecture covers:
• How to estimate how good a state and action are for a policy
• How to use those estimates to form a more efficient RL algorithm
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Chelsea Finn
Assistant Professor in Computer Science and Electrical Engineering at Stanford University and co-founder of Pi.
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 5: Off-Policy Actor Critic
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
April 16, 2025
This lecture covers:
• Off-policy actor critic methods
• All of the key concepts for practical algorithms like PPO and SAC
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Chelsea Finn
Assistant Professor in Computer Science and Electrical Engineering at Stanford University and co-founder of Pi.
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 6: Q-Learning
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
April 18, 2025
This lecture covers:
• How Q-functions relate to policies
• How to do RL without learning an explicit policy
• How to stabilize Q-learning in practice
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Chelsea Finn
Assistant Professor in Computer Science and Electrical Engineering at Stanford University and co-founder of Pi.
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 7: Offline RL
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
April 23, 2025
This lecture covers:
• Key challenges arising in offline reinforcement learning
• Two approaches for offline RL (& why they work!)
• How offline RL can improve over imitation learning
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Chelsea Finn
Assistant Professor in Computer Science and Electrical Engineering at Stanford University and co-founder of Pi.
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 8: Reward Learning
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
April 25, 2025
This lecture covers:
• Why task specification is hard (& why naïve methods fail)
• Methods for learning rewards from human supervision
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Chelsea Finn
Assistant Professor in Computer Science and Electrical Engineering at Stanford University and co-founder of Pi.
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 9: RL for LLMs
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
April 30, 2025
This guest lecture covers RL for LLMs: preference optimization.
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Archit Sharma
Researcher on the Gemini team and lead author of DPO
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 10: RL for LLM Reasoning
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
May 2, 2025
This lecture covers reinforcement learning for LLM reasoning.
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Aviral Kumar
Professor at Carnegie Mellon University
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 11: Model-Based RL
View course details: https://online.stanford.edu/courses/xcs224r-deep-reinforcement-learning
May 7, 2025
This lecture covers:
• How to learn and use dynamics models
• The key challenges and trade-offs arising in model-based RL
To learn more about enrolling in the graduate course, visit: https://online.stanford.edu/courses/cs224r-deep-reinforcement-learning
To follow along with the course schedule and syllabus, visit:
https://cs224r.stanford.edu/
Chelsea Finn
Assistant Professor in Computer Science and Electrical Engineering at Stanford University and co-founder of Pi.
