
Fraud Detection

Code: CC4036     Acronym: CC4036     Level: 400

Keywords
Classification: Official
Keyword: Computer Science

Instance: 2022/2023 - 2S

Active? Yes
Responsible unit: Department of Computer Science
Course/CS Responsible: Master in Information Security

Cycles of Study/Courses

Acronym: M:SI
No. of Students: 20
Study Plan: Study plan since 2020/2021
Curricular Years: 1
Credits UCN: -
Credits ECTS: 6
Contact hours: 42
Total Time: 162
Last updated on 2023-04-26.

Fields changed: Program, Calculation formula of final grade

Teaching language

Suitable for English-speaking students

Objectives

This course studies data analysis methodologies that are useful for detecting and forecasting fraudulent cases. As data collection spreads to practically all human activities, there is a growing need for techniques that analyse such data automatically in order to detect or predict situations that could be considered anomalous or potentially fraudulent.

Learning outcomes and competences

Students are expected to:

  1. Acquire theoretical knowledge of data analysis methodologies that are useful for the detection and prediction of fraud/anomalies;

  2. Acquire practical experience in developing and using software for the detection and prediction of fraud/anomalies;

  3. Acquire expertise in fraud detection by analysing practical case studies on this type of problem.

Working method

In person

Program

1) Introduction to Data Mining
- Data Mining applications and CRISP-DM methodology.
- A brief introduction to the R programming language, data import, and basic manipulation.

2) Data Understanding
- Data Summarization
- Data Visualization

3) Data Preparation
- Data Quality Issues
- Data Pre-processing

4) Unsupervised Learning
- Descriptive Analytics
- Clustering Algorithms and Validation Methods

5) Supervised Learning
- Classification and Regression problems.
- Binary and multiclass classification.
- Evaluation metrics
- Algorithms: k-NN, Naive Bayes, Linear Regression, Ridge and Lasso Regression, CART, SVMs, ANNs.

6) Ensembles
- Motivation and Types of Ensembles
- Algorithms: Bagging, Random Forest, Boosting, AdaBoost, XGBoost.

7) Evaluation Methodologies
- Performance estimation and experimental methodologies.
- Comparison of Models: statistical significance, paired comparisons on single and multiple tasks.


8) Imbalanced Domain Learning and Anomaly Detection
- Challenges
- Approaches 
- Open Research Questions

Mandatory literature

Barnett, Vic; Outliers in Statistical Data. ISBN: 0-471-99599-1
Torgo, Luís; Data Mining with R. ISBN: 9781439810187 (hardback)

Complementary Bibliography

Han, J.; Kamber, M. and Pei, J.; Data Mining: Concepts and Techniques (3rd edition)

Teaching methods and learning activities

Classes will combine theory and practice, with the exposition of theory complemented by practical exercises on the computer.

Software

R statistical software

Evaluation Type

Distributed evaluation without final exam

Assessment Components

Designation Weight (%)
Test 40.00
Practical or project work 60.00
Total: 100.00

Amount of time allocated to each course unit

Designation Time (hours)
Project development 0.00
Independent study 0.00
Class attendance 0.00
Total: 0.00

Eligibility for exams

A minimum score of 7 on the theoretical test is required.

Calculation formula of final grade

The following formula gives the final classification:

NF = 0.4 * NT + 0.6 * NP

where NT is the grade of the theoretical test and NP is the grade of the individual project.
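
As a worked example of the formula, NT = 12 and NP = 15 give NF = 0.4 * 12 + 0.6 * 15 = 4.8 + 9.0 = 13.8. The short Python sketch below is purely illustrative and not part of the official course material. The names `final_grade` and `MIN_TEST_GRADE` are hypothetical, and returning `None` when the test grade is below the required minimum of 7 is an assumption about how that requirement might be modelled.

```python
from typing import Optional

# Minimum theoretical test grade required by the course page.
MIN_TEST_GRADE = 7.0


def final_grade(nt: float, np_grade: float) -> Optional[float]:
    """Compute NF = 0.4 * NT + 0.6 * NP.

    nt       -- grade of the theoretical test (NT)
    np_grade -- grade of the individual project (NP)

    Assumption: a test grade below the minimum yields None
    rather than a numeric final grade.
    """
    if nt < MIN_TEST_GRADE:
        return None
    return 0.4 * nt + 0.6 * np_grade


if __name__ == "__main__":
    grade = final_grade(12, 15)
    # Format to one decimal place to avoid floating-point noise.
    print(f"NF = {grade:.1f}")
```

Because the project carries the larger weight, a one-point change in NP moves NF by 0.6, while a one-point change in NT moves it by 0.4.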

Internship work/project

The practical assignment is individual and consists of developing and presenting a project aimed at detecting fraud in a real data set.
Copyright 1996-2025 © Faculdade de Ciências da Universidade do Porto
Page created on: 2025-06-16 at 02:59:35