Publication

Resource-bounded outlier detection using clustering methods

Title
Resource-bounded outlier detection using clustering methods
Type
Article in International Scientific Journal
Year
2010
Authors
Torgo, L
(Author)
FCUP
Soares, C
(Author)
FEP
Journal
Vol. 218
Pages: 84-98
ISSN: 0922-6389
Publisher: IOS PRESS
Indexing
Publication in ISI Web of Knowledge
Other information
Authenticus ID: P-007-W1T
Abstract (EN): This paper describes a methodology for the application of hierarchical clustering methods to the task of outlier detection. The methodology is tested on the problem of cleaning Official Statistics data. The goal is to detect erroneous foreign trade transactions in data collected by the Portuguese Institute of Statistics (INE). These transactions are a minority, but they still have an important impact on the statistics produced by the institute. The detection of these rare errors is a manual, time-consuming task. This type of task is usually constrained by a limited amount of available resources. Our proposal addresses this issue by producing a ranking of outlyingness that allows better management of the available resources by allocating them to the cases which are most different from the others and, thus, have a higher probability of being errors. Our method is based on the output of standard agglomerative hierarchical clustering algorithms, resulting in no significant additional computational costs. Our results show that it enables large savings by selecting a small subset of suspicious transactions for manual inspection, which, nevertheless, includes most of the erroneous transactions. In this study we compare our proposal to a state-of-the-art outlier ranking method (LOF) and show that our method achieves better results on this particular application. The results of our experiments are also competitive with previous results on the same data. Finally, the outcome of our experiments raises important questions concerning the method currently followed at INE for items with a small number of transactions.
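The abstract does not give the scoring rule, so the following is only a minimal sketch of the general idea of ranking cases from the merge history of an agglomerative clustering. It rests on several assumptions that are not taken from the paper: single-linkage clustering on Euclidean distance (chosen for brevity), a score equal to the size imbalance of the most lopsided merge in which a case belonged to the smaller group, and the hypothetical function names `merge_history`, `outlier_scores` and `inspection_order`. The `budget` parameter stands in for the limited inspection resources the abstract mentions.

```python
from itertools import combinations
from math import dist


def merge_history(points):
    """Naive single-linkage agglomerative clustering (illustrative only).

    Returns the merges, in order, as (members_a, members_b) pairs of
    index lists. Cubic in the number of points, so small inputs only.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Pick the two clusters whose closest members are nearest.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: min(
                dist(points[i], points[j])
                for i in clusters[ab[0]]
                for j in clusters[ab[1]]
            ),
        )
        merges.append((clusters[a], clusters[b]))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


def outlier_scores(points):
    """Assumed scoring rule: a case that joins a much larger group as part
    of a small group gets a score close to 1; balanced merges score 0."""
    scores = [0.0] * len(points)
    for left, right in merge_history(points):
        small, large = sorted((left, right), key=len)
        imbalance = (len(large) - len(small)) / (len(large) + len(small))
        for i in small:
            scores[i] = max(scores[i], imbalance)
    return scores


def inspection_order(points, budget):
    """Indices of the `budget` highest-scoring cases, most suspicious first."""
    scores = outlier_scores(points)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Toy data: five nearby points and one isolated point (index 5).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
print(inspection_order(points, budget=1))  # the isolated point ranks first
```

Because the scores come directly from the merge history, the ranking adds only a single pass over the merges on top of the clustering itself, which is in the spirit of the abstract's remark about no significant additional computational cost.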
Language: English
Type (Professor's evaluation): Scientific
No. of pages: 15
Related Publications

Of the same authors

Model Selection for Time Series Forecasting: An Empirical Analysis of Multiple Estimators (2023)
Article in International Scientific Journal
Cerqueira, V; Torgo, L; Carlos Soares
Early anomaly detection in time series: a hierarchical approach for predicting critical health episodes (2023)
Article in International Scientific Journal
Cerqueira, V; Torgo, L; Carlos Soares
Dynamic discretization of continuous attributes (1998)
Article in International Scientific Journal
Gama, J; Torgo, L; Soares, C
Arbitrage of forecasting experts (2019)
Article in International Scientific Journal
Vitor Cerqueira; Luís Torgo; Fábio Pinto; Carlos Soares
A case study comparing machine learning with statistical methods for time series forecasting: size matters (2022)
Article in International Scientific Journal
Cerqueira, V; Torgo, L; Carlos Soares


Of the same journal

Preface (2008)
Another Publication in an International Scientific Journal
Soares, C; Peng, Y; Meng, J; Washio, T; Zhou, ZH
Frontiers in Artificial Intelligence and Applications: Preface (2010)
Another Publication in an International Scientific Journal
Soares, C; Ghani, R
Applications of Data Mining in E-Business Finance: Introduction (2007)
Chapter or Part of a Book
Soares, C; Peng, Y; Meng, J; Washio, T; Zhou, ZH
Robust Division in Clustering of Streaming Time Series (2008)
Article in International Scientific Journal
Pedro Pereira Rodrigues; Joao Gama
Learning from Data Streams: Synopsis and Change Detection (2008)
Article in International Scientific Journal
Raquel Sebastiao; Joao Gama; Teresa Mendonca

