Knowledge Extraction and Machine Learning

Code:	EIC0096
Acronym:	ECAC

Keywords
Classification	Keyword
OFICIAL	Artificial Intelligence

Instance: 2013/2014 - 1S (of 09-09-2013 to 20-12-2013)

Active?	Yes
E-learning page:	http://moodle.up.pt/
Responsible unit:	Department of Informatics Engineering
Course/CS Responsible:	Master in Informatics and Computing Engineering

Cycles of Study/Courses

Acronym	No. of Students	Study Plan	Curricular Years	Credits UCN	Credits ECTS	Contact hours	Total Time
MIEIC	13	Syllabus since 2009/2010	5	-	6	56	162

Teaching language

Suitable for English-speaking students

Objectives

Background

After a period in which companies and institutions invested heavily in data collection as part of the computerization of their operations, there is now a need to put this data at the service of those organizations. The goal is to extract knowledge from data, improving efficiency and gaining competitive advantage. The course unit (UC) Knowledge Extraction and Machine Learning (ECAC) arises from this need.

Objectives

Motivate students to use knowledge extraction (EC) techniques, also known as data mining, in decision support.

Develop the ability to apply these techniques properly to the automated analysis of large amounts of data.

Component distribution

Scientific Component: 70%

Technological Component: 30%

Learning outcomes and competences

Students should be able to

Understand the different types of EC tasks.
Identify decision support problems that can be represented as EC tasks.
Understand the phases of an EC project.
Know the main methods / algorithms for each EC task type and understand the basics of their behavior.
Apply these methods to decision support problems.
Evaluate the results of an EC project.

Working method

In person

Pre-requirements (prior knowledge) and co-requirements (common knowledge)

Although no specific UC is required, it is useful to have attended an introductory statistics UC.

It is also important that the student has basic knowledge of algorithms.

Program

Part I - Descriptive Data Mining

Introduction to knowledge extraction/data mining.

Clustering: Partitional (review of K-means, K-medoids) and hierarchical algorithms. Other algorithms. Evaluation measures.

Association Rules: Apriori algorithm. Other algorithms. Evaluation measures.

Methodologies for Data Mining: The process of knowledge extraction. CRISP-DM. Project management.

Pre-processing of data: Data cleansing and data transformation (normalization, reduction and discretization).
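
As an illustration of the transformations named above, here is a minimal Python sketch of one normalization method (min-max scaling) and one discretization method (equal-width binning). The syllabus does not say which specific methods are covered, so both choices, the function names, and the sample values are assumptions made for this example only.

```python
def min_max_normalize(values):
    """Rescale values linearly to [0, 1] (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Discretize values into n_bins intervals of equal width.

    Returns the 0-based bin index of each value.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [20, 30, 40, 60]              # hypothetical attribute values
print(min_max_normalize(ages))       # [0.0, 0.25, 0.5, 1.0]
print(equal_width_bins(ages, 2))     # [0, 0, 1, 1]
```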

Part II - Predictive Data Mining

Evaluation of predictive models: Review of decision trees. Overfitting in decision trees. Evaluation methodologies.

Classification: Classification algorithms (rule-, instance- and kernel-based methods, Bayesian methods). Common issues in classification (unbalanced distribution of classes and costs). Evaluation measures.

Regression: Regression algorithms (linear and non-linear regression, regression trees, MARS). Evaluation measures.

Outlier detection: Definitions of outliers. Algorithms for detection of outliers.

Multiple models: Ensemble learning. Metalearning.

Trends in predictive models: Deep learning.

Part III - Complex data

Text mining: Representation of data for text mining. Evaluation measures.

Web mining and recommendation systems.

Mining social networks: Representation of data from social networks. Algorithms for extracting knowledge from social networks.

Inductive Logic Programming and Relational Knowledge Extraction: Non-propositional data representation. Algorithms for knowledge extraction from non-propositional data.

Mining Data Streams and time series: Time series and data streams. Algorithms for extracting knowledge from time series and streaming data. Evaluation methodologies and measures.

Mandatory literature

Jiawei Han, Micheline Kamber; Data mining. ISBN: 1-55860-489-8

Complementary Bibliography

Ian H. Witten, Eibe Frank; Data mining. ISBN: 1-55860-552-5
Peter Flach; Machine Learning: The Art and Science of Algorithms that Make Sense of Data, Cambridge University Press, 2012. ISBN: 9781107422223 (http://www.cs.bris.ac.uk/~flach/mlbook/)
Mohammed Zaki and Wagner Meira Jr.; Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, 2013. ISBN: 9780521766333 (http://www.dcc.ufmg.br/miningalgorithms/DokuWiki/doku.php)

Teaching methods and learning activities

Theoretical presentation and discussion of the concepts.

Laboratory sessions for practical application of the concepts learned.

Software

RapidMiner 5
The R Project for Statistical Computing

Evaluation Type

Distributed evaluation with final exam

Assessment Components

Designation	Weight (%)
Exam	50.00
Class participation	0.00
Written assignment	50.00
Total:	100.00

Amount of time allocated to each course unit

Designation	Time (hours)
Independent study	60.00
Class attendance	42.00
Laboratory work	60.00
Total:	162.00

Eligibility for exams

The distributed evaluation consists of developing a practical project. When a student misses a component of the distributed evaluation, that component is graded 0 (zero). Students with worker status who do not attend classes regularly should present the progress of their work at regular intervals and should give their presentation at the same time as the other students.

Calculation formula of final grade

0.5 * Assignment Grade + 0.5 * Exam Grade

Examinations or Special Assignments

The examination will be conducted without access to any materials. The assignment consists of the analysis of a dataset and the preparation of a final report that describes and discusses the project and the corresponding results. The project is worth 50% of the final grade, divided as follows:

problem presentation: 8/Nov - 5%

data preparation: 15/Nov - 5%

evolution of the project in class: 6/Dec - 10%

final presentation: 20/Dec - 5%

report: 25%
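
The weights above can be combined with the final-grade formula in a short Python sketch. It assumes that every component and the exam are graded on the same scale, and that the assignment grade is the weighted average of its components using the stated percentages; the page does not spell out either point. The function name, dictionary keys, and sample grades are hypothetical.

```python
# Project component weights, as percentages of the final grade.
PROJECT_WEIGHTS = {
    "problem_presentation": 5,
    "data_preparation": 5,
    "project_evolution": 10,
    "final_presentation": 5,
    "report": 25,
}
EXAM_WEIGHT = 50  # the exam carries the remaining 50%


def final_grade(project_grades, exam_grade):
    """Weighted final grade; a missed project component counts as 0."""
    project_points = sum(
        weight * project_grades.get(name, 0.0)
        for name, weight in PROJECT_WEIGHTS.items()
    )
    return (project_points + EXAM_WEIGHT * exam_grade) / 100


# Hypothetical student: 16 on every project component, 12 on the exam.
grades = {name: 16.0 for name in PROJECT_WEIGHTS}
print(final_grade(grades, 12.0))  # 14.0

# Same student without a report: that component is graded 0.
del grades["report"]
print(final_grade(grades, 12.0))  # 10.0
```

Because the project weights sum to 50, this is equivalent to 0.5 * Assignment Grade + 0.5 * Exam Grade under the assumptions stated above.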

Special assessment (TE, DA, ...)

Students with worker status or equivalent must take the exam and carry out the project.

Classification improvement

The distributed evaluation grade can only be improved in the following year.


Copyright 1996-2025 © Faculdade de Engenharia da Universidade do Porto
Page generated on: 2025-06-14 at 23:54:52