
Advanced Topics in Data Science

Code: CC4061     Acronym: CC4061     Level: 400

Keywords
Classification  Keyword
Official        Computer Science

Instance: 2025/2026 - 2S

Active? Yes
Responsible unit: Department of Computer Science
Course/CS Responsible: Master in Bioinformatics and Computational Biology

Cycles of Study/Courses

Acronym   No. of Students  Study Plan                                   Curricular Years  Credits UCN  Credits ECTS  Contact hours  Total Time
E:BBC     0                PE_Bioinformatics and Computational Biology  1                 -            6             42             162
M:A_ASTR  2                Study plan since academic year 2024/2025     1, 2              -            6             42             162
M:BBC     3                Official study plan since 2025/2026          1                 -            6             42             162
M:CC      13               Study plan since academic year 2025/2026     1                 -            6             42             162
M:CTN     0                Study plan since academic year 2025/2026     1                 -            6             42             162
M:DS      17               Study plan since academic year 2025/2026     1                 -            6             42             162
M:ECAD    3                Study plan since 2021/2022                   2                 -            6             42             162
M:ENM     0                Official Study Plan since 2023/2024          1, 2              -            6             42             162
M:ENSI    4                Official study plan since 2025/2026          1                 -            6             42             162
M:M       0                Official study plan since 2024/2025          1, 2              -            6             42             162

Teaching Staff - Responsibilities

Teacher Responsibility
Alípio Mário Guedes Jorge

Teaching - Hours

Theoretical and practical: 3.23
Type                       Teacher                    Classes  Hours
Theoretical and practical  Totals                     2        6.462
                           Alípio Mário Guedes Jorge           3.231

Teaching language

Portuguese and English
Note: All course materials are provided in English.

Objectives

The course covers the identification and application of data science techniques for extracting knowledge from diverse data sources, with a focus on NLP and Information Retrieval. Students will learn to handle and explore text (natural language processing), interaction data (recommendation systems and association rules), sequences (sequence mining), and networks in a web and social media context (link analysis). The course also covers outlier detection and its application in this context.

Learning outcomes and competences

At the end of the course, the student should be able to:
- recognize problems that can be solved with the techniques covered;
- identify and specify tasks similar to those discussed;
- obtain and pre-process data for the algorithms and tasks addressed;
- understand and use the algorithms;
- obtain, interpret, evaluate and use models;
- implement some of the algorithms and propose changes to improve them.
 

Working method

In person

Prerequisites (prior knowledge) and co-requisites (common knowledge)

The student should be familiar with the basic concepts of data science and machine learning, and should be able to program in a language commonly used for data mining tasks, such as Python.

Program

1. Natural Language Processing:
• Text representation
• Preprocessing
• NLP tasks
• Classical and deep learning approaches
• NLP applications

2. Web:
• Information retrieval
• Recommendation systems: collaborative filtering, matrix factorization, and deep learning approaches
• Link analysis

3. Frequent pattern extraction:
• Frequent itemsets and association rules
• Apriori and FP-Growth algorithms
• Itemset summarization and rule selection
• Deep learning approaches

4. Rare value discovery:
• Challenges
• Unsupervised techniques
• Semi-supervised techniques
• Applications in NLP, IR, and web

Mandatory literature

Daniel Jurafsky, James H. Martin; Speech and Language Processing, 3rd edition, Prentice Hall / Pearson, 2025 (https://web.stanford.edu/~jurafsky/slp3/)
Emrul Hasan, Mizanur Rahman, Chen Ding, Jimmy Xiangji Huang, Shaina Raza; Review-based Recommender Systems: A Survey of Approaches, Challenges and Future Perspectives, ACM, 2025 (https://doi.org/10.1145/3742421)
Petru Kallay, Tudor Dan Mihoc; Comparative Analysis of Frequent Pattern Mining Algorithms, 2025 (https://link.springer.com/article/10.1007/s44427-025-00008-1)

Teaching methods and learning activities

Theoretical-practical classes in which the topics covered in the program are presented, along with practical application examples. The practical part consists of solving exercises and carrying out group work, with a final presentation and discussion of the results.
 

Software

R
RStudio
Python
JupyterLab

Evaluation Type

Distributed evaluation with final exam

Assessment Components

Designation                Weight (%)
Practical or project work  40.00
Exam                       50.00
Test                       10.00
Total:                     100.00

Amount of time allocated to each course unit

Designation                                   Time (hours)
Project development                           35.00
Independent study                             84.00
Presentation/discussion of a scientific work  1.00
Class attendance                              42.00
Total:                                        162.00

Eligibility for exams

All scheduled practical assignments are mandatory.

At least 70% attendance is required for both theoretical and practical laboratory classes.

Calculation formula of final grade

The course assessment is distributed, consisting of a test, a final exam, and a practical assignment.

The combined grade is calculated by weighting the practical and theoretical grades using the formula:

NComb = 0.50 * NE + 0.10 * NT + 0.40 * NTP

where NE is the exam grade, NT is the test grade, and NTP is the practical assignment grade.

The final grade (NF) cannot exceed the individual grade NInd (test plus exam) by more than 30%:

NF = min(1.3 * NInd, NComb)

If the exam grade is higher than the test grade, or if the student did not take the test for justified reasons, the exam will have a weight of 60% and the test will not be considered.

Students who do not obtain a minimum of 30% in each component (except the test) will not pass.

The resit exam counts for 60% (12 out of 20) of the final grade, or is combined with the test grade in the same proportions as in the regular exam period.
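
The grading rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the official calculation: the function name final_grade and the returned keys are hypothetical, grades are assumed to be on the 0-20 scale (so the 30% minimum becomes 6.0), and NInd is assumed to be the weighted average of exam and test rescaled to 0-20, since the page does not define it precisely.

```python
from typing import Optional

# ASSUMPTION: grades are on the 0-20 scale, so the 30% minimum is 6.0.
MIN_COMPONENT = 0.30 * 20


def final_grade(ne: float, ntp: float, nt: Optional[float] = None) -> dict:
    """Sketch of the final-grade rules (hypothetical helper).

    ne  -- exam grade (NE)
    ntp -- practical assignment grade (NTP)
    nt  -- test grade (NT), or None if the test was not taken
    """
    if nt is None or ne > nt:
        # Test is dropped: the exam takes the full 60% weight.
        n_ind = ne
        n_comb = 0.60 * ne + 0.40 * ntp
    else:
        # ASSUMPTION: NInd is the exam/test weighted average rescaled to 0-20.
        n_ind = (0.50 * ne + 0.10 * nt) / 0.60
        n_comb = 0.50 * ne + 0.10 * nt + 0.40 * ntp

    # The final grade cannot exceed the individual grade by more than 30%.
    nf = min(1.3 * n_ind, n_comb)

    # Minimum of 30% in each component except the test.
    meets_minimums = ne >= MIN_COMPONENT and ntp >= MIN_COMPONENT

    return {"NInd": n_ind, "NComb": n_comb, "NF": nf,
            "meets_minimums": meets_minimums}


if __name__ == "__main__":
    print(final_grade(ne=12, ntp=16, nt=14))
    print(final_grade(ne=10, ntp=20))
```

Under these assumptions, a student with NE = 10, NTP = 20 and no test has NComb = 14, but the 30% cap limits the final grade to 1.3 * 10 = 13.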

Examinations or Special Assignments

The practical assignment will be announced in the middle of the semester and should be completed and presented by the end of the semester.

 

Special assessment (TE, DA, ...)

The student can improve only the theoretical grade by taking the supplementary exam.
The requirement for minimum attendance in classes does not apply.

Classification improvement

The evaluation of the practical assignment is not subject to improvement.

The student can improve the theoretical grade by taking the supplementary exam.


 

Observations

All the provided material (slides, recommended books, assignments, exams, etc.) is in English.
Copyright 1996-2025 © Faculdade de Ciências da Universidade do Porto
Page created on: 2025-11-19 at 07:04:15