Statistics and Data Analysis

Code: M4114     Acronym: M4114     Level: 400

Keywords
Classification: Official
Keyword: Mathematics

Instance: 2024/2025 - 2S

Active? Yes
Responsible unit: Department of Mathematics
Course/CS Responsible: Master in Data Science

Cycles of Study/Courses

Acronym: M:DS
No. of Students: 32
Study Plan: Official Study Plan since 2018_M:DS
Curricular Years: 1
Credits UCN: -
Credits ECTS: 6
Contact hours: 42
Total Time: 162

Teaching Staff - Responsibilities

Teacher Responsibility
Joaquim Fernando Pinto da Costa
Óscar António Louro Felgueiras

Teaching - Hours

Theoretical and practical: 3.23
Type: Theoretical and practical
Classes: 1
Total hours: 3.231
Óscar António Louro Felgueiras: 1.615
Joaquim Fernando Pinto da Costa: 1.615

Teaching language

English

Objectives

Train students in multivariate data analysis methods so that they can extract the essential information from a potentially large data set, with a focus on supervised and unsupervised learning methods.

Learning outcomes and competences

1. Understand the theoretical foundations of the methodologies taught.
2. Be able to extract essential information from a real data set using the methodologies taught.

And in particular:
- Recognize different multivariate data analysis problems and solve them with the methods addressed, using the R software;
- Prepare, solve and present computational data mining projects in which the various models presented are discussed, evaluated and compared on concrete cases;
- Solve computational and non-computational exercises on the methodologies addressed.

Working method

In person

Pre-requirements (prior knowledge) and co-requirements (common knowledge)

Prior knowledge of random variables, probability distributions, sample statistics, confidence intervals and hypothesis tests is required. These topics are usually covered in an introductory undergraduate course on Probability and Statistics.

Program

Exploratory (preliminary) data analysis

Factorial Analysis:
Principal Component Analysis;
Simple Correspondence Analysis;  
Multiple Correspondence Analysis;
Multidimensional Scaling.

Cluster Analysis:
Comparison measures;
Hierarchical Clustering;
Non-Hierarchical Clustering;
Model-based Clustering.

Discriminant Analysis:
Discriminant Analysis in 2 groups;
Discriminant Analysis in K groups;
Decision Trees.

Mandatory literature

Written notes provided by the teachers
James Gareth; An introduction to statistical learning. ISBN: 978-1-4614-7137-0
Everitt Brian S.; Applied multivariate data analysis. ISBN: 978-0-470-71117-0
000040365. ISBN: 0-387-95284-5

Complementary Bibliography

000098707. ISBN: 978-0-521-86116-8
Sharma, Subhash; Applied multivariate techniques. ISBN: 0-471-31064-6
Hair Jr Joseph F.; Multivariate data analysis. ISBN: 0-13-515309-3
Jianqing Fan, Runze Li and Cun-Hui Zhang; Statistical Foundations of Data Science, Chapman and Hall/CRC, 1st edition, 2019. ISBN: 978-1466510845

Teaching methods and learning activities

Classes combine theory and practice, with several application examples worked through in statistical software.
The software used is R, a free programming language.

Software

R Project

keywords

Physical sciences > Mathematics > Statistics

Evaluation Type

Distributed evaluation with final exam

Assessment Components

Designation: Weight (%)
Practical or project work: 40.00
Exam: 60.00
Total: 100.00

Amount of time allocated to each course unit

Designation: Time (hours)
Independent study: 80.00
Class attendance: 42.00
Presentation/discussion of a scientific work: 2.00
Written work: 38.00
Total: 162.00

Eligibility for exams

Attendance is not mandatory.

Practical work and its oral presentation are compulsory in the normal season, appeal season, and special seasons.

Calculation formula of final grade

1. Evaluation is distributed, with a final exam. There is also an exam in the second evaluation period ("época de recurso").

2. Grade improvement: students who want to improve their exam grade can sit the second exam ("época de recurso"). The grade for the practical work cannot be improved.

Final Score = 0.6 * Score_of_exam + 0.4 * Score_of_work.

The same formula also applies to the second exam (appeal season, "época de recurso") and to special seasons.

Practical work and its oral presentation are compulsory in the normal season, appeal season ("recurso") and special seasons.

Approval requires Score_of_exam to be at least 7.5 points (on a scale of 0 to 20).
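
As an illustration only, the grade calculation can be sketched in a few lines of Python. The function names and the sample scores are hypothetical; the 0.6/0.4 weights and the 7.5 exam minimum are the ones stated above. The sketch checks only the exam minimum and does not model any other approval condition.

```python
# Minimal sketch of the final-score rule described in this section.
# Function names and sample scores are illustrative assumptions.

EXAM_WEIGHT = 0.6      # weight of the exam in the final score
WORK_WEIGHT = 0.4      # weight of the practical work in the final score
MIN_EXAM_SCORE = 7.5   # minimum exam score required (scale of 0 to 20)


def final_score(score_of_exam: float, score_of_work: float) -> float:
    """Return 0.6 * Score_of_exam + 0.4 * Score_of_work."""
    return EXAM_WEIGHT * score_of_exam + WORK_WEIGHT * score_of_work


def meets_exam_minimum(score_of_exam: float) -> bool:
    """Check the exam minimum that approval is subject to."""
    return score_of_exam >= MIN_EXAM_SCORE


if __name__ == "__main__":
    exam, work = 12.0, 16.0  # hypothetical scores on the 0 to 20 scale
    print("Final score:", round(final_score(exam, work), 2))
    print("Exam minimum met:", meets_exam_minimum(exam))
```

Because the work counts for 40% of the score, a strong project raises the final score but cannot compensate for an exam score below the minimum.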

The practical work consists of analysing a real data set with the methods taught, using statistical software.
It must be done in groups of 2 students.

Classification improvement

Improvement of the final mark: students who have passed may sit the exam in the second evaluation period ("época de recurso") to improve their exam mark.
The mark obtained in the written assignment/project cannot be improved.
The evaluation formula is the same (see above).
Copyright 1996-2025 © Faculdade de Ciências da Universidade do Porto
Page created on: 2025-06-16 at 00:39:25