Lectures and discussions related to advanced topics and new areas of interest in decision and control theory: hybrid, sampled-data, and fault-tolerant systems; control over networks; vision-based control; system estimation and identification; dynamic games. Course Information: May be repeated up to 12 hours within a term, and up to 20 hours total for the course. Credit towards a degree from multiple offerings of this course is not given if those offerings have significant overlap, as determined by the ECE department. Prerequisite: As specified each term. It is expected that each offering will have a 500-level course as prerequisite or co-requisite.
Topic: MDPs, Reinforcement Learning. Prerequisites: ECE 534.
The course will discuss techniques to solve dynamic optimization problems where the system dynamics are unknown. The course will first introduce dynamic programming techniques for Markov decision problems and then focus on solving the dynamic programming equations approximately when the underlying parameters of the Markov chain are unknown. While the emphasis will be on techniques for which one can prove performance bounds, heuristics used in reinforcement learning will also be presented to show their relationship to existing theory, and to identify open theoretical problems.
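As a rough illustration of the dynamic programming step mentioned above, the following sketch runs value iteration on a small Markov decision problem whose transition probabilities and rewards are known. It is not taken from the course materials: the function name `value_iteration`, the data layout of `P` and `R`, the discount factor, and the two-state example are all assumptions made for illustration. The case the course concentrates on, where these parameters are unknown, is not covered by this sketch.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Approximately solve the Bellman optimality equation by repeated backups.

    P[s][a] is a list of (next_state, probability) pairs (hypothetical layout).
    R[s][a] is the expected one-step reward for action a in state s.
    """
    n_states = len(P)
    V = [0.0] * n_states

    def q_value(s, a, values):
        # One-step lookahead: immediate reward plus discounted expected value.
        return R[s][a] + gamma * sum(p * values[s2] for s2, p in P[s][a])

    for _ in range(max_iters):
        V_new = [
            max(q_value(s, a, V) for a in range(len(P[s])))
            for s in range(n_states)
        ]
        # Stop once successive iterates agree to within the tolerance.
        converged = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if converged:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [
        max(range(len(P[s])), key=lambda a, s=s: q_value(s, a, V))
        for s in range(n_states)
    ]
    return V, policy


# Hypothetical two-state, two-action example (assumed values, for illustration only).
# In each state, action 0 leads to state 0 and action 1 leads to state 1.
P = [
    [[(0, 1.0)], [(1, 1.0)]],
    [[(0, 1.0)], [(1, 1.0)]],
]
R = [
    [0.0, 1.0],  # rewards in state 0
    [0.0, 2.0],  # rewards in state 1
]

V, policy = value_iteration(P, R)
print(V, policy)
```

With a discount factor below one, the backup is a contraction, so the iterates converge to the unique fixed point of the Bellman equation for this known model.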