
Knowledge Extraction and Machine Learning

Code: EIC0096     Acronym: ECAC

Keywords
Classification  Keyword
OFFICIAL        Artificial Intelligence

Instance: 2010/2011 - 1S

Active? Yes
Web Page: http://www.fe.up.pt/~ec/index.html
Responsible unit: Department of Industrial Engineering and Management
Course/CS Responsible: Master in Informatics and Computing Engineering

Cycles of Study/Courses

Acronym  No. of Students  Study Plan                Curricular Years  Credits UCN  Credits ECTS  Contact hours  Total Time
MIEIC    20               Syllabus since 2009/2010  5                 -            6             56             162

Teaching language

Portuguese

Objectives

To give students the knowledge needed to apply techniques for analysing large volumes of data and extracting patterns from them.

Program

Introduction to knowledge extraction: the concept of Data Mining; the Data Mining and Knowledge Discovery process.

Data preparation: Data cleaning; Data Normalization, Reduction and Discretization.
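As an illustration of two of the data preparation steps listed above, the following is a minimal sketch of min-max normalization and equal-width discretization. The function names and the sample values are hypothetical and chosen only for this example; the course tools (Weka, R, RapidMiner) provide their own implementations.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value the 0-based index of one of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last interval, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [18, 22, 30, 45, 60]  # made-up sample data
normalized = min_max_normalize(ages)
bins = equal_width_bins(ages, 3)
```

Equal-width binning is only one possible discretization; equal-frequency binning would instead put roughly the same number of values in each interval.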

Association Rules: Definition of the association rule mining problem. Quality measures for association rules. Selected algorithms for mining association rules.
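The syllabus does not say which quality measures are covered. As a sketch, assuming the commonly used support, confidence and lift, they can be computed for a rule "antecedent => consequent" as follows. The transactions and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent); assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence divided by the consequent's support; 1.0 indicates independence."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)


# Made-up market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
rule_lift = lift(transactions, {"bread"}, {"milk"})
```

A lift below 1 for this data suggests that, in these four transactions, buying bread makes milk slightly less likely than its baseline frequency.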

Clustering: Clustering Techniques. Partition clustering algorithms (K-means, K-medoids) and Hierarchical clustering quality. Other algorithms: BIRCH, CURE, DBSCAN.
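To illustrate the partition clustering idea behind K-means, here is a minimal sketch of the standard iterative procedure (assign each point to its nearest centroid, then recompute centroids). Using the first k points as initial centroids is an assumption made here to keep the example deterministic; practical implementations usually use random or smarter initialization. The data and names are hypothetical.

```python
import math


def k_means(points, k, max_iter=100):
    """Cluster points (tuples of floats) into k groups; returns (centroids, clusters)."""
    # Assumption for this sketch: seed the centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # assignments are stable, so the procedure has converged
        centroids = new_centroids
    return centroids, clusters


# Made-up 2-D data with two visually separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
```

K-medoids follows the same assign-and-update pattern but restricts each cluster centre to be one of the actual data points.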

Web Mining: Data Mining concepts on the Web; Information retrieval on the Web; Mining usage patterns on the Web; Structure analysis and search on the Web.

Classification: Classification techniques for the analysis of large data quantities; Decision Trees; Classification and Regression Trees (CART); Pruning principles; Bayesian Classification; Inductive Logic Programming.

Relational Data Mining using Inductive Logic Programming.

PKDD: Parallel Knowledge Discovery in Databases – Parallel Processing Techniques for the extraction of patterns in large data quantities.

KDD Applications.

Mandatory literature

Han, Jiawei; Data mining. ISBN: 1-55860-489-8

Complementary Bibliography

Ian H. Witten and Eibe Frank; Data Mining: Practical Machine Learning Tools and Techniques, Elsevier, 2005. ISBN: 0120884070

Teaching methods and learning activities

Theoretical classes: Presentation of theoretical concepts.
Practical classes: Solving exercises, discussing topics presented in the theoretical classes, and support for the practical assignments.

Software

Weka 3: Data Mining Software in Java
The R Project for Statistical Computing
SPSS 17.0
RapidMiner 5

Evaluation Type

Distributed evaluation with final exam

Assessment Components

Description             Type                Time (hours)  Weight (%)  End date
Attendance (estimated)  Attendance          39.00
Project                 Written assignment  60.00
Final Exam              Exam                3.00
Total:                                      -             0.00

Amount of time allocated to each course unit

Description  Type               Time (hours)  End date
Studying     Independent study  60
Total:                          60.00

Eligibility for exams

The average grade of the distributed evaluation component must be at least 6 marks.

Calculation formula of final grade

0.5 * Assignment Grade + 0.5 * Exam Grade

Examinations or Special Assignments

The assignment consists of analysing a dataset using the techniques learned.
A progress report and a final report are required.
Students must present their work in the last class of the term.
The progress report counts for 10% of the final course mark, and the final report and presentation count for 40%.
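The weights above can be combined into a single calculation. This is a minimal sketch, assuming all three components are graded on the same scale; the function name and the sample grades are hypothetical.

```python
def final_grade(progress_report, final_report_and_presentation, exam):
    """Combine the components using the stated weights: 10%, 40% and 50%."""
    return 0.1 * progress_report + 0.4 * final_report_and_presentation + 0.5 * exam


def assignment_grade(progress_report, final_report_and_presentation):
    """Assignment grade implied by the weights: its two parts make up half the final mark."""
    return (0.1 * progress_report + 0.4 * final_report_and_presentation) / 0.5


# Made-up grades for illustration.
grade = final_grade(14, 16, 12)
assignment = assignment_grade(14, 16)
```

With these numbers, 0.5 * assignment + 0.5 * 12 gives the same result as final_grade, which matches the final grade formula.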

Special assessment (TE, DA, ...)

Students exempted from attending the practical classes must still complete the practical assignment and the final exam.

Classification improvement

The grade for the distributed evaluation component can only be improved in the following year.
Copyright 1996-2025 © Faculdade de Engenharia da Universidade do Porto
Page generated on: 2025-06-16 at 20:27:09