Data Mining II
Keywords
Classification | Keyword
Official | Computer Science
Instance: 2023/2024 - 2S
Cycles of Study/Courses
Teaching language
Portuguese and English
Obs.: All course materials are provided in English.
Objectives
Identify and apply data mining techniques for extracting knowledge from various data sources. The focus is on association rules, sequence mining, recommendation systems, link analysis, information retrieval, and text mining.
Learning outcomes and competences
At the end of the course, the student should be able to:
- recognize problems that can be solved with the data mining techniques covered in the course;
- identify and specify data mining tasks similar to those discussed;
- obtain and pre-process data for the algorithms and tasks addressed;
- understand and use data mining algorithms;
- obtain, interpret, evaluate and use data mining models;
- implement some of the algorithms and propose changes to improve them.
Working method
In person
Prerequisites (prior knowledge) and co-requisites (common knowledge)
Students should be familiar with the basic concepts of data mining and with a programming language used in data mining tasks, such as R or Python.
Program
1. Association Pattern Mining
• frequent itemsets and association rules
• Apriori algorithm
• itemset summarization and rule selection
• FP-Growth algorithm
2. Web Mining
• recommender systems
• link analysis
• information retrieval
3. Information Retrieval
• pre-processing
• retrieval models
• retrieval evaluation
4. Text Mining
• document representation in vector spaces
• document clustering
• document classification
• sentiment and emotion analysis
5. Outlier Mining
• challenges
• unsupervised techniques
• semi-supervised techniques
• supervised techniques
Mandatory literature
Bing Liu; Web Data Mining. ISBN: 978-3-642-19459-7
David Hand; Principles of Data Mining. ISBN: 978-0-262-08290-7
Complementary Bibliography
Charu C. Aggarwal; Data Mining: The Textbook, Springer, 2015. ISBN: 978-3-319-14141-1
Teaching methods and learning activities
Theoretical-practical classes in which the program topics are presented together with practical application examples. The practical component consists of solving exercises and carrying out a group project, with a final presentation and discussion of the results.
Software
R
RStudio
Evaluation Type
Distributed evaluation with final exam
Assessment Components
Designation | Weight (%)
Practical or project work | 40.00
Exam | 60.00
Total: | 100.00
Amount of time allocated to each course unit
Designation | Time (hours)
Project development | 35.00
Independent study | 84.00
Presentation/discussion of a scientific work | 1.00
Class attendance | 42.00
Total: | 162.00
Eligibility for exams
The practical assignment is mandatory, with a minimum grade of 30%.
At least 70% attendance in theoretical classes and practical laboratory sessions is required.
A minimum grade of 30% in the final exam is required.
Calculation formula of final grade
The assessment of the course is distributed, consisting of a final exam and a practical assignment at the end of the semester.
The final grade is calculated by the weighted average of the practical and theoretical grades through the formula:
FG = 0.60 * FE + 0.40 * PA
where,
FE is the grade of the final exam and
PA is the grade of the practical assignment.
Students who do not obtain a minimum of 30% in each component will not pass the course.
The supplementary exam is worth 60% of the final grade (12 out of 20 points).
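As an illustration only, the weighted formula and the 30% minimum per component can be sketched in Python. This is a minimal sketch, not an official calculator: it assumes both grades are expressed on a 0-20 scale (suggested by the "12 out of 20" mention above), and the function name `final_grade` and the constants are hypothetical. It does not model attendance or any overall pass mark, since neither is defined by the formula.

```python
SCALE_MAX = 20.0          # assumption: grades are on a 0-20 scale
MIN_FRACTION = 0.30       # minimum required in each component (from the rules above)
EXAM_WEIGHT = 0.60        # weight of the final exam (FE)
ASSIGNMENT_WEIGHT = 0.40  # weight of the practical assignment (PA)


def final_grade(fe: float, pa: float) -> tuple[float, bool]:
    """Return (FG, meets_minimums) for exam grade fe and assignment grade pa."""
    for name, value in (("FE", fe), ("PA", pa)):
        if not 0.0 <= value <= SCALE_MAX:
            raise ValueError(f"{name} must be between 0 and {SCALE_MAX}")
    # FG = 0.60 * FE + 0.40 * PA
    fg = EXAM_WEIGHT * fe + ASSIGNMENT_WEIGHT * pa
    # Each component must reach 30% of the scale (6.0 out of 20 under the assumption).
    minimum = MIN_FRACTION * SCALE_MAX
    meets_minimums = fe >= minimum and pa >= minimum
    return round(fg, 2), meets_minimums


if __name__ == "__main__":
    print(final_grade(12.0, 15.0))
    print(final_grade(5.0, 18.0))
```

Under these assumptions, an exam grade of 12 and an assignment grade of 15 give FG = 0.60 * 12 + 0.40 * 15 = 13.2 with both minimums met, whereas an exam grade of 5 fails the 30% minimum regardless of the assignment grade.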
Examinations or Special Assignments
The practical assignment will be announced in the middle of the semester and should be completed and presented by the end of the semester.
Special assessment (TE, DA, ...)
The student can improve only the theoretical grade by taking the supplementary exam.
The requirement for minimum attendance in classes does not apply.
Classification improvement
The evaluation of the practical assignment is not subject to improvement.
The student can improve the theoretical grade by taking the supplementary exam.
Observations
All provided material (slides, recommended books, etc.) is in English.