Decision Making in Large Scale Systems

Lecture Notes

LEC #	TOPICS	LECTURE NOTES
1	Markov Decision Processes Finite-Horizon Problems: Backwards Induction Discounted-Cost Problems: Cost-to-Go Function, Bellman's Equation	(PDF)
2	Value Iteration Existence and Uniqueness of Bellman's Equation Solution Gauss-Seidel Value Iteration	(PDF)
3	Optimality of Policies derived from the Cost-to-Go Function Policy Iteration Asynchronous Policy Iteration	(PDF)
4	Average-Cost Problems Relationship with Discounted-Cost Problems Bellman's Equation Blackwell Optimality	(PDF)
5	Average-Cost Problems Computational Methods	(PDF)
6	Application of Value Iteration to Optimization of Multiclass Queueing Networks Introduction to Simulation-based Methods Real-Time Value Iteration	(PDF)
7	Q-Learning Stochastic Approximations	(PDF)
8	Stochastic Approximations: Lyapunov Function Analysis The ODE Method Convergence of Q-Learning	(PDF)
9	Exploration versus Exploitation: The Complexity of Reinforcement Learning	(PDF)
10	Introduction to Value Function Approximation Curse of Dimensionality Approximation Architectures	(PDF)
11	Model Selection and Complexity	(PDF)
12	Introduction to Value Function Approximation Algorithms Performance Bounds	(PDF)
13	Temporal-Difference Learning with Value Function Approximation	(PDF)
14	Temporal-Difference Learning with Value Function Approximation (cont.)	(PDF)
15	Temporal-Difference Learning with Value Function Approximation (cont.) Optimal Stopping Problems General Control Problems	(PDF)
16	Approximate Linear Programming	(PDF)
17	Approximate Linear Programming (cont.)	(PDF)
18	Efficient Solutions for Approximate Linear Programming	(PDF)
19	Efficient Solutions for Approximate Linear Programming: Factored MDPs	(PDF)
20	Policy Search Methods	(PDF)
21	Policy Search Methods (cont.)	(PDF)
22	Policy Search Methods for POMDPs Application: Call Admission Control Actor-Critic Methods
23	Approximate POMDP Compression
24	Policy Search Methods: PEGASUS Application: Helicopter Control

Assignments

Problem Set 1 (PDF)

Problem Set 2 (PDF)

Problem Set 3 (PDF)

Problem Set 4 (PDF)

Problem Set 5 (PDF)

Projects

The final project consists of a 10-15 page project report and 15-20 minute presentation. Students have option of working on theory, algorithms and / or applications. Project proposals are submitted midway through the term, with the final project due at the end of the term.

Some representative projects are presented in the table below, courtesy of the student author(s).

TOPICS	STUDENTS	PROJECTS
Approximate Dynamic Programming (Via Linear Programming) for Stochastic Scheduling	Mohamed Mostagir Nelson Uhan	Paper (PDF) (Courtesy of Mohamed Mostagir and Nelson Uhan. Used with permission.) Slides (PDF) (Courtesy of Mohamed Mostagir and Nelson Uhan. Used with permission.)
How to choose the State Relevance Weight in the Approximate Linear Programming Approach for Dynamic Programming?	Yann Le Tallec Theophane Weber	Paper (PDF) (Courtesy of Yann Le-Tallec and Theophane Weber. Used with permission.) Slides (PDF) (Courtesy of Yann Le-Tallec and Theophane Weber. Used with permission.)
Decentralized Strategies for the Assignment Problem	Hariharan Lakshmanan	Slides (PDF) (Courtesy of Hariharan Lakshamanan. Used with permission.)

Nav view search

Navigation

Search

Decision Making in Large Scale Systems

Assignments