Decision Making in Large Scale Systems
Lecture Notes
LEC # | TOPICS | LECTURE NOTES |
1 | Markov Decision Processes | (PDF) |
2 | Value Iteration | (PDF) |
3 | Optimality of Policies derived from the Cost-to-Go Function | (PDF) |
4 | Average-Cost Problems | (PDF) |
5 | Average-Cost Problems | (PDF) |
6 | Application of Value Iteration to Optimization of Multiclass Queueing Networks | (PDF) |
7 | Q-Learning | (PDF) |
8 | Stochastic Approximations: Lyapunov Function Analysis | (PDF) |
9 | Exploration versus Exploitation: The Complexity of Reinforcement Learning | (PDF) |
10 | Introduction to Value Function Approximation | (PDF) |
11 | Model Selection and Complexity | (PDF) |
12 | Introduction to Value Function Approximation Algorithms | (PDF) |
13 | Temporal-Difference Learning with Value Function Approximation | (PDF) |
14 | Temporal-Difference Learning with Value Function Approximation (cont.) | (PDF) |
15 | Temporal-Difference Learning with Value Function Approximation (cont.) | (PDF) |
16 | Approximate Linear Programming | (PDF) |
17 | Approximate Linear Programming (cont.) | (PDF) |
18 | Efficient Solutions for Approximate Linear Programming | (PDF) |
19 | Efficient Solutions for Approximate Linear Programming: Factored MDPs | (PDF) |
20 | Policy Search Methods | (PDF) |
21 | Policy Search Methods (cont.) | (PDF) |
22 | Policy Search Methods for POMDPs | |
23 | Approximate POMDP Compression | |
24 | Policy Search Methods: PEGASUS | |
Assignments
Problem Set 1 (PDF)
Problem Set 2 (PDF)
Problem Set 3 (PDF)
Problem Set 4 (PDF)
Problem Set 5 (PDF)
Projects
The final project consists of a 10-15 page project report and a 15-20 minute presentation. Students have the option of working on theory, algorithms, and/or applications. Project proposals are submitted midway through the term, and the final project is due at the end of the term.
Some representative projects are presented in the table below, courtesy of the student author(s).
TOPICS | STUDENTS | PROJECTS |
Approximate Dynamic Programming (Via Linear Programming) for Stochastic Scheduling | Mohamed Mostagir | Paper (PDF) (Courtesy of Mohamed Mostagir and Nelson Uhan. Used with permission.) |
How to choose the State Relevance Weight in the Approximate Linear Programming Approach for Dynamic Programming? | Yann Le Tallec | Paper (PDF) (Courtesy of Yann Le Tallec and Theophane Weber. Used with permission.) |
Decentralized Strategies for the Assignment Problem | Hariharan Lakshmanan | Slides (PDF) (Courtesy of Hariharan Lakshmanan. Used with permission.) |