Data Mining I
Keywords
Classification | Keyword
Official | Information Technology
Instance: 2020/2021 - 1S
Cycles of Study/Courses
Teaching language
English
Objectives
By the end of the semester, students should know the various data mining tasks and the main methods and algorithms for each task, be able to apply these methods to new data analysis problems, and be able to evaluate results critically.
Learning outcomes and competences
Knowledge of the various data mining tasks and of the main methods and algorithms for each task; ability to apply these methods to new data analysis problems; capacity to evaluate results critically.
Working method
In person
Program
- Machine learning, data mining.
- From OLTP to OLAP. Multidimensional databases.
- Knowledge: Representation. Generalization and specialization.
- Data: Examples and instances of concepts. Attributes and values. Signal and noise. Multi-relational representations. Types of attributes. Data formats for data mining systems. Exploratory data analysis.
- Distance-based methods. The k-NN algorithm and its properties.
- Probabilistic methods. Bayesian classifiers.
- Methods based on search. Decision trees and rules. Covering algorithm.
- Evaluation of classification models. Costs. Overfitting.
- Pre-processing: Feature selection, discretization, dealing with unknown values and outliers.
- Advanced topics of classification.
- Regression. Evaluation of regression models.
- Frequent patterns, association rules.
- Cluster analysis.
- Methods for combination of models. Voting methods. Methods based on samples. Hierarchical combination of models. Data Mining tools.
- Methodologies of data mining projects (CRISP-DM).
Mandatory literature
J. Gama, A. Carvalho, K. Faceli, A. Lorena, M. Oliveira; Extração de Conhecimento de Dados, Sílabo, 2012. ISBN: 978-972-618-698-4
Jiawei Han; Data Mining: Concepts and Techniques, Morgan Kaufmann, 2006
Tom M. Mitchell; Machine learning, McGraw Hill, 1997
Ian Witten, Eibe Frank; Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, 2000
David Hand, Heikki Mannila, Padhraic Smyth; Principles of Data Mining, MIT Press, 2001
Teaching methods and learning activities
Theoretical and practical classes
Software
Weka -> http://www.cs.waikato.ac.nz/ml/weka/
R -> http://www.r-project.org/
Knime
Keywords
Technological sciences > Technology > Information technology
Physical sciences > Computer science > Cybernetics > Artificial intelligence
Evaluation Type
Distributed evaluation with final exam
Assessment Components
Designation | Weight (%)
Exam | 50.00
In-person participation | 0.00
Written assignment | 50.00
Total: | 100.00
Amount of time allocated to each course unit
Designation | Time (hours)
Project work | 50.00
Class attendance | 42.00
Written assignment | 50.00
Presentation/discussion of a scientific paper | 15.00
Independent study | 5.00
Total: | 162.00
Eligibility for exams
The weighted sum of the homework assignments and the exam must be at least 9.5:
0.5 * Exam + 0.5 * (average of the assignments) >= 9.5
Calculation formula of final grade
Exam: 50%. Practical work: 50%. The practical work mark is the mean of the marks of all practical works; a practical work that is not done receives a mark of 0. The minimum mark is 6.5 for the exam and 6.5 for the practical work.
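As an illustration of the calculation described above, here is a minimal Python sketch. It rests on several assumptions not stated in this page: marks are on a 0-20 scale, the final grade is not rounded, and a student passes only if both minimum marks and the 9.5 threshold are met together. The function name `final_grade` and the use of `None` for a work that was not done are hypothetical choices made for the example.

```python
from statistics import mean

PASS_MARK = 9.5       # threshold on the weighted sum, as stated above
MIN_COMPONENT = 6.5   # minimum mark for the exam and for the practical work


def final_grade(exam, works):
    """Return (grade, passed) for one student.

    `works` has one entry per practical work; None marks a work that
    was not done, which counts as 0 (assumed encoding).
    """
    # Mean over all practical works, counting missing ones as 0.
    practical = mean(0.0 if w is None else w for w in works)
    # Exam and practical work each weigh 50%.
    grade = 0.5 * exam + 0.5 * practical
    # Assumption: both minimums and the overall threshold must hold.
    passed = (exam >= MIN_COMPONENT
              and practical >= MIN_COMPONENT
              and grade >= PASS_MARK)
    return grade, passed


if __name__ == "__main__":
    # Exam 12.0; three practical works, the second one not done.
    print(final_grade(12.0, [14.0, None, 16.0]))  # (11.0, True)
```

In this example the practical work mark is (14 + 0 + 16) / 3 = 10, so the final grade is 0.5 * 12 + 0.5 * 10 = 11. A student with an exam mark of 6.0 would fail under the assumed rule even with high practical marks, because the exam minimum of 6.5 is not met.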