Publication

Online boxplot derived outlier detection

Title
Online boxplot derived outlier detection
Type
Article in International Scientific Journal
Year
2024
Authors
Mazarei, A
(Author)
Other
Sousa, R
(Author)
Other
João Mendes-Moreira
(Author)
FEUP
Molchanov, S
(Author)
Other
Ferreira, HM
(Author)
Other
Journal
Indexing
Publication in ISI Web of Knowledge - 0 Citations
Publication in Scopus - 0 Citations
Other information
Authenticus ID: P-010-EPB
Abstract (EN): Outlier detection is a widely used technique for identifying anomalous or exceptional events across various contexts. It has proven to be valuable in applications like fault detection, fraud detection, and real-time monitoring systems. Detecting outliers in real time is crucial in several industries, such as financial fraud detection and quality control in manufacturing processes. In the context of big data, the amount of data generated is enormous, and traditional batch mode methods are not practical since the entire dataset is not available. The limited computational resources further compound this issue. Boxplot is a widely used batch mode algorithm for outlier detection that involves several derivations. However, the lack of an incremental closed form for statistical calculations during boxplot construction poses considerable challenges for its application within the realm of big data. We propose an incremental/online version of the boxplot algorithm to address these challenges. Our proposed algorithm is based on an approximation approach that involves numerical integration of the histogram and calculation of the cumulative distribution function. This approach is independent of the dataset's distribution, making it effective for all types of distributions, whether skewed or not. To assess the efficacy of the proposed algorithm, we conducted tests using simulated datasets featuring varying degrees of skewness. Additionally, we applied the algorithm to a real-world dataset concerning software fault detection, which posed a considerable challenge. The experimental results underscored the robust performance of our proposed algorithm, highlighting its efficacy comparable to batch mode methods that access the entire dataset. Our online boxplot method, leveraging dataset distribution to define whiskers, consistently achieved exceptional outlier detection results. 
Notably, our algorithm demonstrated computational efficiency, maintaining constant memory usage with minimal hyperparameter tuning.
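The approach described in the abstract (a histogram updated incrementally, then integrated into a cumulative distribution from which boxplot statistics are read) can be illustrated with a minimal sketch. This is not the authors' implementation. The class name `OnlineHistogramBoxplot`, the fixed value range `[lo, hi]` known in advance, the equal-width bins, and the classic Tukey 1.5 × IQR whisker rule are all assumptions made for illustration; the abstract says the paper derives its whiskers from the data distribution but does not give the rule.

```python
class OnlineHistogramBoxplot:
    """Illustrative streaming boxplot built on a fixed-range histogram.

    Assumptions (not from the paper): the range [lo, hi] is known up front,
    bins have equal width, and whiskers follow the Tukey 1.5 * IQR rule.
    Memory use is constant: one counter per bin.
    """

    def __init__(self, lo, hi, n_bins=100, whisker=1.5):
        if hi <= lo or n_bins < 1:
            raise ValueError("need hi > lo and n_bins >= 1")
        self.lo, self.hi = lo, hi
        self.n_bins = n_bins
        self.width = (hi - lo) / n_bins
        self.counts = [0] * n_bins
        self.n = 0
        self.whisker = whisker

    def update(self, x):
        """Add one observation; values outside [lo, hi] go to the edge bins."""
        idx = int((x - self.lo) / self.width)
        idx = min(max(idx, 0), self.n_bins - 1)
        self.counts[idx] += 1
        self.n += 1

    def quantile(self, q):
        """Approximate the q-quantile by accumulating bin counts (the
        empirical CDF) and interpolating linearly inside the crossing bin."""
        if self.n == 0:
            raise ValueError("no data seen yet")
        target = q * self.n
        cum = 0
        for i, c in enumerate(self.counts):
            if c > 0 and cum + c >= target:
                frac = (target - cum) / c
                return self.lo + (i + frac) * self.width
            cum += c
        return self.hi

    def fences(self):
        """Lower and upper whisker limits under the assumed Tukey rule."""
        q1, q3 = self.quantile(0.25), self.quantile(0.75)
        iqr = q3 - q1
        return q1 - self.whisker * iqr, q3 + self.whisker * iqr

    def is_outlier(self, x):
        """Flag x if it lies outside the current whisker limits."""
        lower, upper = self.fences()
        return x < lower or x > upper
```

In a stream, each new value would typically be checked with `is_outlier` against the current state and then passed to `update`. The accuracy of the quantiles depends on the bin width, and clamping out-of-range values into the edge bins is a simplification of this sketch.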
Language: English
Type (Professor's evaluation): Scientific
No. of pages: 15
Related Publications

Of the same journal

Using network features for credit scoring in microfinance (2021)
Article in International Scientific Journal
Paraíso, P; Ruiz, S; Gomes, P; Rodrigues, L; João Gama
Resampling strategies for imbalanced time series forecasting (2017)
Article in International Scientific Journal
Moniz, N; Branco, P; Torgo, L
Personalised medicine challenges: quality of data (2018)
Article in International Scientific Journal
Ricardo Cruz Correia; Ferreira, D; Bacelar Silva, GM; Vieira Marques, PM; Maranhão, PA

Copyright 1996-2024 © Reitoria da Universidade do Porto
Page created on: 2024-11-09 05:11:32