Knowledge Extraction and Machine Learning

Code: EIC0096     Acronym: ECAC

Keywords
Classification: Official     Keyword: Artificial Intelligence

Instance: 2020/2021 - 1S

Active? Yes
Responsible unit: Department of Informatics Engineering
Course/CS Responsible: Master in Informatics and Computing Engineering

Cycles of Study/Courses

Acronym: MIEIC
No. of Students: 65
Study Plan: Syllabus since 2009/2010
Curricular Years: 5
Credits UCN: -
Credits ECTS: 6
Contact hours: 42
Total Time: 162

Last updated on 2020-09-21.

Fields changed: Teaching methods and learning activities, Classification improvement, Calculation formula of final grade

Teaching language

Suitable for English-speaking students

Objectives

With the increasing digitization of their processes, organizations (companies, government, etc.) now feel the need to extract knowledge from the resulting data to improve the efficiency and effectiveness of these processes (e.g., to gain a competitive advantage). To this end, organizations need technical skills to develop solutions based on standard approaches to Machine Learning (ML) and Data Mining (DM), as well as scientific skills to develop innovative solutions to problems for which no standard approach exists.

Thus, the goals of this course are:


  • Motivate the use of ML / DM techniques in decision support.

  • Develop the ability to properly utilize these techniques for automatic analysis of large amounts of data.

  • Develop the ability to undertake scientific research to develop new approaches to ML / DM.


 

Percentage Distribution


  • Scientific component: 70%

  • Technological component: 30%

Learning outcomes and competences

Students should be able to

  • Understand the different types of Data Mining (DM) tasks.
  • Identify decision support problems that can be represented as DM tasks.
  • Be aware of the main methods / algorithms for the most common DM tasks and understand the basics of their operation.
  • Apply these methods correctly when conducting a DM project, following a proper methodology.
  • Appropriately evaluate the results of a DM project.
  • Identify opportunities for developing new approaches to ML / DM.
  • Develop simple but appropriate scientific work to create new approaches to ML / DM.

Working method

In person

Pre-requirements (prior knowledge) and co-requirements (common knowledge)

Although no particular course is required, it is useful to have basic knowledge of:


  • statistics

  • algorithms

  • artificial intelligence and machine learning.

Program


  • Introduction to Machine Learning and Data Mining.

  • DM Projects: DM methodologies and data preparation.

  • Classification: introduction, evaluation (measures and methodologies) and algorithms (rule-, distance- and kernel-based methods; Bayesian methods). Scoring with classification models: approach and evaluation. Common classification issues (unbalanced class distribution and costs).

  • Regression: introduction, evaluation (measures; bias-variance trade-off) and algorithms.

  • Clustering: partitional (review of K-means, K-medoids), density-based and hierarchical algorithms. Evaluation measures.

  • Frequent Pattern Discovery: frequent itemset algorithms (Apriori, Eclat, FP-Growth) and association rules. Evaluation measures (support, confidence, lift, ...). Other types of patterns: sequences and graphs.

  • Recommendation systems: introduction, evaluation (measures and methodologies) and algorithms (content based, collaborative filtering, specialized systems).

  • Ensemble learning: methods (Bagging, Random Forests, AdaBoost, Negative Correlation Learning). Characteristics of a good ensemble.

  • Incremental learning: introduction, evaluation (measures and methodologies) and algorithms (very fast decision trees, incremental clustering algorithms).

  • Automated machine learning (autoML) and meta learning.

Mandatory literature

João Moreira, Andre Carvalho, Tomás Horvath; A General Introduction to Data Analytics, Wiley, 2018. ISBN: 978-1-119-29626-3 (https://www.wiley.com/en-aw/A+General+Introduction+to+Data+Analytics-p-9781119296263)

Complementary Bibliography

Ian H. Witten, Eibe Frank; Data mining. ISBN: 1-55860-552-5
Peter Flach; Machine Learning: The Art and Science of Algorithms that Make Sense of Data, Cambridge University Press, 2012. ISBN: 9781107422223 (http://www.cs.bris.ac.uk/~flach/mlbook/)
Mohammed Zaki and Wagner Meira Jr.; Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, 2013. ISBN: 9780521766333 (http://www.dcc.ufmg.br/miningalgorithms/DokuWiki/doku.php)
Jiawei Han, Micheline Kamber; Data mining. ISBN: 1-55860-489-8
Max Kuhn, Kjell Johnson; Applied Predictive Modeling, Springer New York, 2013. ISBN: 9781461468493
Charu C. Aggarwal; Data mining. ISBN: 978-3-319-14142-8

Teaching methods and learning activities


  • Theoretical classes and individual study for exposition of concepts.

  • Laboratory sessions and data mining project for practical application and consolidation of learned concepts.

  • Research project and writing of a scientific article for development of research skills.

Software

Python
RapidMiner
The R Project for Statistical Computing

Evaluation Type

Distributed evaluation without final exam

Assessment Components

Designation Weight (%)
Exam 35.00
In-class participation 0.00
Laboratory work 65.00
Total: 100.00

Amount of time allocated to each course unit

Designation Time (hours)
Independent study 60.00
Class attendance 42.00
Laboratory work 60.00
Total: 162.00

Eligibility for exams

The distributed assessment consists of:

  • DM project,
  • mini test, and
  • scientific project, including the writing of a scientific paper.

If a student misses one of the components of the distributed assessment, the grade for that component is 0 (zero).

Students with worker or equivalent status who are exempt from class attendance should present the progress of their work at regular intervals, to be agreed with the teachers, and give the scheduled presentations together with the regular students.

Calculation formula of final grade

Final grade = 0.30 * DM project + 0.35 * mini-test + 0.35 * research project

Minimum grade in each component: 7.0 (out of 20)
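
As an illustration of the formula and the minimum-grade rule above, the following Python sketch computes a weighted final grade and lists any component below the 7.0 minimum. The function and key names (final_grade, dm_project, mini_test, research_project) are hypothetical. Rounding and the exact consequence of a below-minimum component are not specified here, so the sketch only reports such components rather than deciding approval.

```python
# Weights and minimum taken from the grading formula above.
WEIGHTS = {"dm_project": 0.30, "mini_test": 0.35, "research_project": 0.35}
MIN_COMPONENT_GRADE = 7.0  # out of 20


def final_grade(grades):
    """Return (weighted grade, list of components below the minimum).

    `grades` maps each component name in WEIGHTS to a grade on the 0-20 scale.
    Assumption: how a below-minimum component affects approval is not
    specified, so this function only reports it.
    """
    missing = set(WEIGHTS) - set(grades)
    if missing:
        raise ValueError(f"missing grades for: {sorted(missing)}")
    weighted = sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)
    below_minimum = [n for n in WEIGHTS if grades[n] < MIN_COMPONENT_GRADE]
    return weighted, below_minimum


# Example with made-up grades: 0.30*15 + 0.35*12 + 0.35*16 = 14.3
grade, below = final_grade(
    {"dm_project": 15.0, "mini_test": 12.0, "research_project": 16.0}
)
print(round(grade, 2), below)
```

A component that was missed would be entered as 0, which places it below the minimum.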

Examinations or Special Assignments

The DM project will be developed in groups of 2 students and consists of analyzing a data set and preparing a presentation that describes and discusses both the project and the results obtained.

The scientific project will be carried out individually or in groups of up to 3 students.

Special assessment (TE, DA, ...)

Students who are exempt from class attendance must complete all assessment components and should contact the teacher to make any necessary adjustments to the process.

Classification improvement

Grades for the mini-test and the scientific project may be improved in the special season (recurso) of the year in which the student passes the course.

For components whose grade was not improved in the year in which the student passed, improvement may be attempted in one or more of those components in the following year, during the regular or special season.
Copyright 1996-2024 © Faculdade de Engenharia da Universidade do Porto
Page generated on: 2024-10-06 at 17:21:42