| LEC # | TOPICS | LECTURE NOTES |
|---|---|---|
| 1 | Markov Decision Processes; Finite-Horizon Problems: Backwards Induction; Discounted-Cost Problems: Cost-to-Go Function, Bellman's Equation | (PDF) |
| 2 | Value Iteration; Existence and Uniqueness of Bellman's Equation Solution; Gauss-Seidel Value Iteration | (PDF) |
| 3 | Optimality of Policies Derived from the Cost-to-Go Function; Policy Iteration; Asynchronous Policy Iteration | (PDF) |
| 4 | Average-Cost Problems; Relationship with Discounted-Cost Problems; Bellman's Equation; Blackwell Optimality | (PDF) |
| 5 | Average-Cost Problems; Computational Methods | (PDF) |
| 6 | Application of Value Iteration to Optimization of Multiclass Queueing Networks; Introduction to Simulation-based Methods; Real-Time Value Iteration | (PDF) |
| 7 | Q-Learning; Stochastic Approximations | (PDF) |
| 8 | Stochastic Approximations: Lyapunov Function Analysis; The ODE Method; Convergence of Q-Learning | (PDF) |
| 9 | Exploration versus Exploitation: The Complexity of Reinforcement Learning | (PDF) |
| 10 | Introduction to Value Function Approximation; Curse of Dimensionality; Approximation Architectures | (PDF) |
| 11 | Model Selection and Complexity | (PDF) |
| 12 | Introduction to Value Function Approximation Algorithms; Performance Bounds | (PDF) |
| 13 | Temporal-Difference Learning with Value Function Approximation | (PDF) |
| 14 | Temporal-Difference Learning with Value Function Approximation (cont.) | (PDF) |
| 15 | Temporal-Difference Learning with Value Function Approximation (cont.); Optimal Stopping Problems; General Control Problems | (PDF) |
| 16 | Approximate Linear Programming | (PDF) |
| 17 | Approximate Linear Programming (cont.) | (PDF) |
| 18 | Efficient Solutions for Approximate Linear Programming | (PDF) |
| 19 | Efficient Solutions for Approximate Linear Programming: Factored MDPs | (PDF) |
| 20 | Policy Search Methods | (PDF) |
| 21 | Policy Search Methods (cont.) | (PDF) |
| 22 | Policy Search Methods for POMDPs; Application: Call Admission Control; Actor-Critic Methods | |
| 23 | Approximate POMDP Compression | |
| 24 | Policy Search Methods: PEGASUS; Application: Helicopter Control | |
