Publication

When Two are Better Than One: Synthesizing Heavily Unbalanced Data

Title
When Two are Better Than One: Synthesizing Heavily Unbalanced Data
Type
Article in International Scientific Journal
Year
2021
Authors
Ferreira, F
(Author)
Lourenco, N
(Author)
Cabral, B
(Author)
Joao Paulo Fernandes
(Author)
FEUP
Journal
Title: IEEE Access
Vol. 9
ISSN: 2169-3536
Publisher: IEEE
Indexing
Publication in ISI Web of Knowledge - 0 Citations
Publication in Scopus - 0 Citations
Other information
Authenticus ID: P-00W-518
Abstract (EN): Nowadays, data is king and if treated and used properly it promises to give organizations a competitive edge over rivals by enabling them to develop and design Intelligent Systems to improve their services. However, they need to fully comply with not only ethical but also regulatory obligations, where, e.g., privacy (strictly) needs to be respected when using or sharing data, thus protecting both the interests of users and organizations. Fraud Detection systems are examples of such systems where Machine Learning algorithms leverage information to classify financial transactions as legitimate or illicit. The data used to create these solutions is usually highly structured and contains categorical and continuous features characterised by complex distributions. One of the main challenges of fraud detection is concerned with the scarcity of fraudulent instances which results in highly unbalanced datasets. Additionally, privacy is crucial, and it is usually forbidden, or not possible, to share the data of organizations and individuals for creating or improving models. In this paper we propose a framework for private data sharing based on synthetic data generation using Generative Adversarial Networks (GAN) that learns the specificities of financial transactions data and generates fictitious data that keeps the utility of the original datasets. Our proposal, called Duo-GAN, uses two GAN generators to handle the data imbalance problem, one generator for fraudulent instances and the other for legitimate instances. With this approach, we observed, at most, a 5% disparity in F1 scores between classifiers trained and tested with actual data and the ones trained with synthetic data and tested with actual data.
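The evaluation protocol described in the abstract (one generator per class, then comparing F1 for a classifier trained on actual data against one trained on synthetic data, both tested on actual data) can be illustrated with a minimal sketch. This is not the authors' implementation: each GAN generator is replaced by a hypothetical independent-Gaussian sampler, the classifier is a simple nearest-centroid rule, and the data is a made-up two-feature toy set. All function names below are assumptions introduced for illustration only.

```python
import random
import statistics


def fit_class_sampler(rows):
    """Stand-in for one per-class generator (NOT a GAN): independent Gaussians per feature."""
    params = [(statistics.mean(col), statistics.pstdev(col)) for col in zip(*rows)]

    def sample(n, rng):
        return [[rng.gauss(mu, sd) for mu, sd in params] for _ in range(n)]

    return sample


def synthesize(X, y, rng):
    """Fit one sampler per class and draw as many rows as each class originally had."""
    X_syn, y_syn = [], []
    for label in (0, 1):  # 0 = legitimate, 1 = fraudulent
        rows = [x for x, t in zip(X, y) if t == label]
        sampler = fit_class_sampler(rows)
        X_syn += sampler(len(rows), rng)
        y_syn += [label] * len(rows)
    return X_syn, y_syn


def fit_centroids(X, y):
    """Toy classifier (an assumption): store the mean feature vector of each class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [statistics.mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, X):
    """Assign each row to the class whose centroid is closest (squared Euclidean distance)."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda label: dist(centroids[label], x)) for x in X]


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def make_toy_data(n_legit, n_fraud, rng):
    """Hypothetical unbalanced data: two numeric features, fraud shifted away from legit."""
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_legit)]
    X += [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(n_fraud)]
    return X, [0] * n_legit + [1] * n_fraud


rng = random.Random(0)
X_train, y_train = make_toy_data(2000, 40, rng)   # "actual" training data
X_test, y_test = make_toy_data(1000, 20, rng)     # "actual" held-out data
X_syn, y_syn = synthesize(X_train, y_train, rng)  # synthetic replacement for training data

# Train on actual vs. train on synthetic; both evaluated on actual held-out data.
f1_real = f1_score(y_test, predict(fit_centroids(X_train, y_train), X_test))
f1_syn = f1_score(y_test, predict(fit_centroids(X_syn, y_syn), X_test))
print(f"F1 (trained on actual):    {f1_real:.3f}")
print(f"F1 (trained on synthetic): {f1_syn:.3f}")
print(f"Absolute disparity:        {abs(f1_real - f1_syn):.3f}")
```

Fitting a separate sampler per class means the rare fraudulent class is modelled on its own rather than being swamped by the majority class, which is the motivation the abstract gives for using two generators. The numbers printed by this toy script say nothing about the paper's reported results.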
Language: English
Type (Professor's evaluation): Scientific
No. of pages: 11
Related Publications

Of the same journal

Key Indicators to Assess the Performance of LiDAR-Based Perception Algorithms: A Literature Review (2023)
Another Publication in an International Scientific Journal
José Machado da Silva; K. Chiranjeevi; Correia, M. V.
IEEE ACCESS SPECIAL SECTION EDITORIAL: SOFT COMPUTING TECHNIQUES FOR IMAGE ANALYSIS IN THE MEDICAL INDUSTRY - CURRENT TRENDS, CHALLENGES AND SOLUTIONS (2018)
Another Publication in an International Scientific Journal
D. Jude Hemanth; Lipo Wang; João Manuel R. S. Tavares; Fuqian Shi; Vania Vieira Estrela
From a Visual Scene to a Virtual Representation: A Cross-Domain Review (2023)
Another Publication in an International Scientific Journal
Pereira, A; Pedro Carvalho; Pereira, N; Viana, P; Luís Corte-Real
Visual Trunk Detection Using Transfer Learning and a Deep Learning-Based Coprocessor (2020)
Article in International Scientific Journal
Aguiar, AS; Filipe Neves Santos; Armando Jorge Sousa; Oliveira, PM; Santos, LC
Understanding Overlap in Automatic Root Cause Analysis in Manufacturing Using Causal Inference (2022)
Article in International Scientific Journal
Eduardo E. Oliveira; Vera L. Miguéis; José Luís Moura Borges
