
Data Mining I

Code: CC4018     Acronym: CC4018     Level: 400

Keywords
Classification Keyword
Official Computer Science

Instance: 2022/2023 - 1S

Active? Yes
Responsible unit: Department of Computer Science
Course/CS Responsible: Master in Computer Science

Cycles of Study/Courses

Acronym   No. of Students  Study Plan                         Curricular Years  Credits UCN  Credits ECTS  Contact hours  Total Time
M:A_ASTR  8                Official Study Plan since 2013/14  1, 2              -            6             42             162
M:CC      25               Study plan since 2014/2015         1                 -            6             42             162
M:ERSI    10               Official Study Plan since 2021     1                 -            6             42             162

Teaching language

Suitable for English-speaking students

Objectives

The main objectives of this unit are to introduce the main data science methodologies and to convey knowledge of programming and tools for data analysis, such as the R language.

Learning outcomes and competences

This unit should provide students with:
1. theoretical competences in several basic data science methodologies;
2. competences in developing software for data science tasks;
3. practical competences in applying data science techniques to specific problems.

Working method

In person

Program

 
1. Introduction to Data Science:
• the CRISP-DM model
• data, models and patterns
• data science tasks.

2. Data Pre-Processing:
• importing data
• cleaning data
• transforming and creating variables
• dimensionality reduction techniques

3. Exploring and Visualizing Data:
• data summarization
• data visualization

4. Descriptive Models:
• clustering methods: partitional methods, hierarchical methods

5. Predictive Models:
• classification and regression tasks
• evaluation metrics
• linear regression models, naive Bayes, k-nearest neighbours
• tree-based models: classification and regression trees, pruning methods
• neural networks and deep learning
• support vector machines
• ensembles: bagging, random forests, boosting, AdaBoost, XGBoost

6. Methodologies for Evaluating and Comparing Models:
• evaluation measures
• estimation methods
• significance tests


Mandatory literature

Charu C. Aggarwal; Data mining. ISBN: 978-3-319-14142-8
Jiawei Han; Data mining. ISBN: 978-0-12-381479-1
Luís Torgo; Data mining with R. ISBN: 978-1-4398-1018-7

Complementary Bibliography

Peter Flach; Machine learning. ISBN: 978-1-107-42222-3
Andriy Burkov; The Hundred-Page Machine Learning Book, 2019. ISBN: 978-1999579500

Teaching methods and learning activities

Lectures consist of an oral presentation of the syllabus topics, illustrated with concrete data mining case studies.

Software

RStudio - IDE for R
R - software for data analysis

keywords

Technological sciences > Technology > Information technology
Physical sciences > Computer science > Modelling tools
Physical sciences > Computer science > Informatics > Applied informatics
Technological sciences > Technology > Computer technology > Software technology

Evaluation Type

Distributed evaluation with final exam

Assessment Components

Designation Weight (%)
Test 35.00
Practical or project work 30.00
Exam 35.00
Total: 100.00

Amount of time allocated to each course unit

Designation Time (hours)
Project development 35.00
Independent study 84.00
Presentation/discussion of a scientific work 1.00
Class attendance 42.00
Total: 162.00

Eligibility for exams

The practical assignment is mandatory with a minimum grade of 30%.

Calculation formula of final grade

The assessment of the course is distributed, consisting of a midterm test during the semester, a final exam and a practical assignment at the end of the semester.

The final grade is the weighted average of the practical and theoretical grades, given by the formula:

NF = 0.35 * TI + 0.35 * Ex + 0.30 * TP

where
TI is the midterm test grade,
Ex is the final exam grade, and
TP is the practical assignment grade.

Students who do not obtain a minimum of 30% in each component, i.e. 6 out of 20, will not pass.

The supplementary exam is worth 70% of the final grade (14 out of 20 points).
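
As a rough illustration of the formula and of the 30% minimum per component, the calculation could be sketched in Python as follows. The function name `final_grade`, the assumption that all three grades are on a 0-20 scale, and the choice to return `None` when a minimum is not met are illustrative assumptions, not part of the official rules; the supplementary exam case is not modelled.

```python
from typing import Optional

# Weights taken from the formula NF = 0.35 * TI + 0.35 * Ex + 0.30 * TP.
WEIGHTS = {"TI": 0.35, "Ex": 0.35, "TP": 0.30}

# Minimum of 30% per component, i.e. 6 out of 20.
MIN_COMPONENT = 6.0


def final_grade(ti: float, ex: float, tp: float) -> Optional[float]:
    """Return NF on a 0-20 scale, or None if any component is below the minimum.

    Hypothetical helper: assumes every grade is given on a 0-20 scale.
    """
    grades = {"TI": ti, "Ex": ex, "TP": tp}
    for name, grade in grades.items():
        if not 0.0 <= grade <= 20.0:
            raise ValueError(f"{name} must be between 0 and 20, got {grade}")
    # A student below 6/20 in any component does not pass.
    if any(grade < MIN_COMPONENT for grade in grades.values()):
        return None
    return sum(WEIGHTS[name] * grades[name] for name in grades)


if __name__ == "__main__":
    print(round(final_grade(12, 14, 16), 2))  # 0.35*12 + 0.35*14 + 0.30*16 = 13.9
    print(final_grade(5, 18, 18))             # None: TI is below the 6/20 minimum
```

Returning `None` rather than a number keeps a failed minimum distinct from a low but valid weighted average.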

Examinations or Special Assignments

The midterm test will take place in one of the classes, in the middle of the semester.

The practical assignment will be announced in the middle of the semester and should be completed by the end of the semester.

Special assessment (TE, DA, ...)

Students in special situations, as defined by the applicable legislation, can arrange to take the midterm test on a date different from the scheduled one.

Classification improvement

The evaluation of the practical assignment is not subject to improvement. 

Students can improve the theoretical grade by taking the supplementary exam.

Observations

All provided material (e.g. slides, recommended books) is in English; if foreign students are enrolled, classes will also be taught in English.

Copyright 1996-2025 © Faculdade de Ciências da Universidade do Porto
Page created on: 2025-06-14 at 09:41:22