Fraud Detection
Keywords
Classification | Keyword
Official | Computer Science
Instance: 2021/2022 - 2S 
Cycles of Study/Courses
Teaching language
Suitable for English-speaking students
Objectives
This course studies data analysis methodologies that are useful for detecting and forecasting fraudulent cases. As data collection spreads to practically all human activities, there is a growing need for techniques that automatically analyse such data to detect or predict situations that could be considered anomalous or potentially fraudulent.
Learning outcomes and competences
Students are expected to:
- Acquire theoretical knowledge of data analysis methodologies that are useful for the detection and prediction of fraud/anomalies;
- Acquire practical experience in developing and using software for the detection and prediction of fraud/anomalies;
- Acquire expertise in fraud detection through the analysis of practical case studies on this type of problem.
Working method
In person
Program
Week 1
- Presentation
- Introduction to Data Mining
- Introduction to R
- Basic Concepts in R (1/2)
Week 2
- Basic Concepts in R (2/2)
- Reporting in R
Week 3
- Data Import in R
- Data Pre-Processing
- Data Summarization
Week 4
- Data Visualization
Week 5
- Introduction to Predictive Modelling
- Evaluation Metrics
- Tree-Based Models
- Naïve Bayes
Week 6
- k-Nearest Neighbours
- Support Vector Machines
- Clustering
Week 7
- Ensemble Learning
Week 8
- Evaluation Methodologies
- Performance Estimation
Week 9
Spring Break
Week 10
1st Test
Week 11
- Outlier Detection
Week 12
- Handling Big Data
Week 13
Student Week
Week 14
- Handling Imbalanced Domains (1/3)
Week 15
- Handling Imbalanced Domains (2/3)
Week 16
- Handling Imbalanced Domains (3/3)
Mandatory literature
Barnett Vic; Outliers in statistical data. ISBN: 0-471-99599-1
Torgo Luís; Data Mining with R. ISBN: 9781439810187 hbk
Complementary Bibliography
Han, J.; Kamber, M.; Pei, J.; Data Mining: Concepts and Techniques (3rd edition)
Teaching methods and learning activities
Classes combine theory and practice: theoretical exposition is complemented by practical exercises on the computer.
Software
R statistical software
Evaluation Type
Distributed evaluation without final exam
Assessment Components
Designation | Weight (%)
Test | 40.00
Practical or project work | 60.00
Total | 100.00
Amount of time allocated to each course unit
Designation | Time (hours)
Project development | 0.00
Autonomous study | 0.00
Class attendance | 0.00
Total | 0.00
Eligibility for exams
A minimum score of 7 on the theoretical test is required.
Calculation formula of final grade
The final classification (NF) is given by the following formula:
NF = 0.4 * NT + 0.6 * NP
where NT is the grade of the theoretical test and NP is the weighted average of the two practical assignments: NP = 0.4 * TP1 + 0.6 * TP2, with TP1 the grade of the first practical assignment and TP2 the grade of the second.
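The grade formula above can be checked with a short worked example. The sketch below is written in Python purely for illustration; the function names and the sample grades are hypothetical, and only the weights and the minimum test score of 7 come from this page. The page does not say how the minimum interacts with the formula, so the check is kept as a separate helper.

```python
# Weights taken from the formulas on this page.
TEST_WEIGHT = 0.4        # weight of the theoretical test (NT)
PRACTICAL_WEIGHT = 0.6   # weight of the practical component (NP)
TP1_WEIGHT = 0.4         # weight of the first practical assignment
TP2_WEIGHT = 0.6         # weight of the second practical assignment
MIN_TEST_GRADE = 7       # minimum test score stated under "Eligibility for exams"


def practical_grade(tp1, tp2):
    """NP = 0.4 * TP1 + 0.6 * TP2."""
    return TP1_WEIGHT * tp1 + TP2_WEIGHT * tp2


def final_grade(nt, tp1, tp2):
    """NF = 0.4 * NT + 0.6 * NP."""
    return TEST_WEIGHT * nt + PRACTICAL_WEIGHT * practical_grade(tp1, tp2)


def meets_test_minimum(nt):
    """Hypothetical helper: checks the stated minimum score on the test."""
    return nt >= MIN_TEST_GRADE


if __name__ == "__main__":
    # Made-up grades, used only to illustrate the arithmetic.
    nt, tp1, tp2 = 12.0, 14.0, 16.0
    print("NP =", round(practical_grade(tp1, tp2), 2))
    print("NF =", round(final_grade(nt, tp1, tp2), 2))
    print("Meets test minimum:", meets_test_minimum(nt))
```

With these sample grades, NP = 0.4 * 14 + 0.6 * 16 = 15.2 and NF = 0.4 * 12 + 0.6 * 15.2 = 13.92.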
Observations
Evaluation Committee:
- Nuno Moniz
- Rita P. Ribeiro