Dynamic Programming and Learning for Decision and Control
Keywords
Classification | Keyword
OFICIAL | Automation and Control
Instance: 2024/2025 - 1S 
Cycles of Study/Courses
Acronym | No. of Students | Study Plan | Curricular Years | Credits UCN | Credits ECTS | Contact hours | Total Time
M.EEC | 7 | Syllabus | 2 | - | 6 | 39 | 162
Teaching Staff - Responsibilities
Teaching language
English
Objectives
This UC aims to carry the foundations acquired in control, optimization, and dynamic systems (differential or discrete-event, deterministic or stochastic) over to the operational level, in order to deal with the computational complexity inherent in optimization and exploration processes, using machine learning techniques, namely reinforcement learning.
Learning outcomes and competences
Students should acquire the fundamental knowledge needed to design and develop support systems for the management and control of dynamic systems, with dynamic programming as the central element, together with the various approximation approaches, generically called "reinforcement learning", that promote different trade-offs between exploration and optimization.
Among the sub-objectives are, on the one hand, establishing a link with previously offered curricular units (essentially dynamic systems, control, optimization, systems with random variables, and Markov chains) and, on the other hand, showing how to link with neural networks as a computationally efficient way to operationalize the presented methods.
Working method
In-person
Pre-requirements (prior knowledge) and co-requirements (common knowledge)
Linear Algebra, Probability and Statistics, Control. Knowledge of machine learning is useful, but is not an essential prerequisite.
Program
1. Introduction.
Clarification, through examples, of how the contents of this UC make it possible to operationalize knowledge from previous courses, namely Control.
2. Review and complement of knowledge on Controlled Markov Chains.
Definition as timed stochastic automata. Transition probability matrix. Transient and steady-state regimes.
Applications to control and optimization. Markov Decision Processes.
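For concreteness, a minimal Python/NumPy sketch of these notions, using an invented two-state chain (the numbers are illustrative, not course material): the transient regime via powers of the transition matrix, and the permanent regime as the stationary distribution.

```python
import numpy as np

# Toy 2-state Markov chain (illustrative values).
P = np.array([[0.9, 0.1],    # row i: transition probabilities from state i
              [0.4, 0.6]])   # each row sums to 1

# Transient regime: distribution after k steps from an initial distribution.
mu0 = np.array([1.0, 0.0])
mu5 = mu0 @ np.linalg.matrix_power(P, 5)

# Permanent (steady-state) regime: left eigenvector of P for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi /= pi.sum()
print(mu5, pi)               # pi ~ [0.8, 0.2] for this chain
```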
3. Dynamic Programming
General basic concepts in discrete contexts: cost-to-go function and the principle of optimality.
Methods for solving the Bellman equation. Basic dynamic programming algorithms for discrete
problems. Example: the Linear Quadratic problem. Types of dynamic programming problems: stochastic shortest path, and discounted cost.
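A minimal sketch of value iteration for the discounted-cost case, on an invented three-state, two-action MDP (all transition matrices and stage costs below are hypothetical): the Bellman operator V <- min_a (c_a + gamma P_a V) is iterated to its fixed point.

```python
import numpy as np

# P[a] is the transition matrix under action a; c[a] the stage-cost vector.
n_states, gamma = 3, 0.9
P = {0: np.array([[0.8, 0.2, 0.0],
                  [0.1, 0.8, 0.1],
                  [0.0, 0.2, 0.8]]),
     1: np.array([[0.5, 0.5, 0.0],
                  [0.0, 0.5, 0.5],
                  [0.5, 0.0, 0.5]])}
c = {0: np.array([2.0, 1.0, 3.0]),
     1: np.array([1.0, 2.0, 0.5])}

V = np.zeros(n_states)
for _ in range(500):                 # fixed-point iteration on the Bellman operator
    Q = np.stack([c[a] + gamma * P[a] @ V for a in P])   # Q[a, s]
    V_new = Q.min(axis=0)            # Bellman equation: V = min_a (c_a + gamma P_a V)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new
policy = Q.argmin(axis=0)            # greedy policy w.r.t. the converged cost-to-go
print(V, policy)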
4. Neural network architectures and training methods.
Architectures for approximating the value function through multilayer neural networks. Training methods for neural networks.
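As one possible illustration, a sketch of fitting a one-hidden-layer network to sampled cost-to-go values by gradient descent, using NumPy only; the targets here are synthetic (V(s) ~ s^2), whereas in the course setting they would come from dynamic programming or simulation.

```python
import numpy as np

# One-hidden-layer value-function approximator trained by gradient descent.
rng = np.random.default_rng(0)
s = rng.uniform(-2, 2, size=(256, 1))          # sampled states
v = s**2 + 0.1 * rng.normal(size=(256, 1))     # noisy synthetic targets

W1, b1 = rng.normal(size=(1, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)
lr = 1e-2
for _ in range(2000):
    h = np.tanh(s @ W1 + b1)                   # hidden layer
    pred = h @ W2 + b2                         # predicted cost-to-go
    err = pred - v                             # squared-error residual
    gW2 = h.T @ err / len(s); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h**2)             # backprop through tanh
    gW1 = s.T @ dh / len(s); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
print(float(np.mean(err**2)))                  # training MSE
```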
5. Iterative stochastic algorithms.
Basic model. Convergence based on a smooth potential function. Convergence via contraction and monotonicity properties.
The ordinary differential equation (ODE) approach.
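A minimal sketch of the basic stochastic approximation model (Robbins-Monro type), assuming a scalar root-finding problem observed through additive noise, with step sizes alpha_k = 1/k satisfying the usual summability conditions.

```python
import numpy as np

# Robbins-Monro iteration: find the root of f(x) = E[g(x, noise)]
# using only noisy samples of g. Here f(x) = x - 2, so x* = 2.
rng = np.random.default_rng(1)
x = 0.0
for k in range(1, 100_000):
    g = (x - 2.0) + rng.normal()   # noisy measurement of f(x)
    alpha = 1.0 / k                # sum(alpha) = inf, sum(alpha^2) < inf
    x -= alpha * g
print(x)                           # ~ 2.0
```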
6. Simulation methods.
Policy evaluation by Monte Carlo simulation. Temporal-difference methods. Optimistic policy iteration. Simulation-based value iteration. Q-learning.
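A minimal tabular Q-learning sketch on an invented four-state chain; the environment, step size, and exploration rate below are illustrative assumptions, not course specifications.

```python
import numpy as np

# Tabular Q-learning: action 0 moves left, action 1 moves right;
# stepping into state 3 yields reward 1 and resets the agent to state 0
# (a continuing-task formulation).
rng = np.random.default_rng(2)
n_states, n_actions = 4, 2
gamma, alpha, eps = 0.95, 0.1, 0.5

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    if s2 == n_states - 1:
        return 0, 1.0              # reward for reaching the goal, then reset
    return s2, 0.0

Q = np.zeros((n_states, n_actions))
s = 0
for _ in range(20_000):
    # epsilon-greedy behaviour policy
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
    s2, r = step(s, a)
    # Q-learning update: bootstrap on the greedy value of the next state
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    s = s2
print(Q.argmax(axis=1))            # greedy policy: 1 (move right) in states 0-2
```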
Mandatory literature
Sutton, R. S., & Barto, A. G.; Reinforcement Learning: An Introduction (2nd ed.), MIT Press, 2018
Complementary Bibliography
Bertsekas, D. P.; Reinforcement Learning and Optimal Control, Athena Scientific, 2019
Bertsekas, D. P.; Dynamic Programming and Optimal Control (3rd ed.), Athena Scientific, 2005
Bertsekas, D. P., & Tsitsiklis, J. N.; Neuro-Dynamic Programming, Athena Scientific, 1996
Cassandras, C. G., & Lafortune, S.; Introduction to Discrete Event Systems (2nd ed.), Springer, 2008
Teaching methods and learning activities
Lecture classes: presentation and discussion of the various topics of the curricular unit, with detailed explanation of examples of application of the concepts and methods.
Exercise-solving classes: practical exercises are solved by the students with the support of the teacher, who clarifies the issues they raise. Follow-up of the work on the mini-projects, supported by the use of OCTAVE/MATLAB and Python.
Software
Octave
MATLAB
Python
Keywords
Physical sciences > Mathematics > Applied mathematics
Technological sciences > Engineering > Electrical engineering
Technological sciences > Engineering > Systems engineering > Systems theory
Technological sciences > Engineering > Control engineering > Automation
Evaluation Type
Distributed evaluation without final exam
Assessment Components
Designation | Weight (%)
Practical or project work | 50.00
In-class participation | 10.00
Test | 40.00
Total: | 100.00
Amount of time allocated to each course unit
Designation | Time (hours)
Project elaboration | 40.00
Autonomous study | 83.00
Class attendance | 39.00
Total: | 162.00
Eligibility for exams
Frequency (eligibility) is obtained through remote participation in at least 75% of the PL classes and through participation in the mini-project.
Calculation formula of final grade
The final assessment has three components:
TE - Written test, on a scale of 0 to 20, with a weight of 40%
MP - Mini-project, developed in groups, on a scale of 0 to 20, with a weight of 50%
CC - Continuous component, on a scale of 0 to 20, with a weight of 10%
Final Grade = 0.4 TE + 0.5 MP + 0.1 CC
The continuous component is measured by the degree of participation in the TP classes.
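For illustration, hypothetical component grades of TE = 14, MP = 16, and CC = 18 would give a final grade of 0.4 × 14 + 0.5 × 16 + 0.1 × 18 = 15.4.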
Examinations or Special Assignments
Mini-project of a learning-based control system, supported by OCTAVE/MATLAB, Python, or other software to be defined.
Internship work/project
NA
Classification improvement
Students may take a new written test, which replaces the previous TE grade; in addition, they may improve the mini-project work individually.