Publication

D-Confidence: An active learning strategy to reduce label disclosure complexity in the presence of imbalanced class distributions

Title
D-Confidence: An active learning strategy to reduce label disclosure complexity in the presence of imbalanced class distributions
Type
Article in International Scientific Journal
Year
2012
Authors
Escudeiro, NF
(Author)
Other
Jorge, AM
(Author)
FCUP
Journal
Vol. 18
Pages: 311-330
ISSN: 0104-6500
Publisher: Springer Nature
Other information
Authenticus ID: P-008-6M3
Abstract (EN): In some classification tasks, such as those related to the automatic building and maintenance of text corpora, it is expensive to obtain labeled instances to train a classifier. In such circumstances it is common to have massive corpora where a few instances are labeled (typically a minority) while the others are not. Semi-supervised learning techniques try to leverage the intrinsic information in unlabeled instances to improve classification models. However, these techniques assume that the labeled instances cover all the classes to be learned, which might not be the case. Moreover, in the presence of an imbalanced class distribution, getting labeled instances from minority classes might be very costly, requiring extensive labeling, if queries are randomly selected. Active learning allows asking an oracle to label new instances, which are selected according to criteria that aim to reduce the labeling effort. D-Confidence is an active learning approach that is effective in the presence of imbalanced training sets. In this paper we evaluate the performance of d-Confidence in comparison to its baseline criteria over tabular and text datasets. We provide empirical evidence that d-Confidence reduces label disclosure complexity, which we have defined as the number of queries required to identify instances from all the classes to be learned, in the presence of imbalanced data. © 2012 The Brazilian Computer Society.
Language: English
Type (Professor's evaluation): Scientific
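Illustrative note: the abstract defines label disclosure complexity as the number of queries required to identify instances from all the classes to be learned. The Python sketch below shows one way to compute that count for a given sequence of queries, together with a random-selection baseline on an imbalanced toy label set. It is not the d-Confidence criterion from the paper, which this record does not describe. The function names `label_disclosure_complexity` and `random_query_baseline`, the toy labels, and the trial count are all assumptions made for illustration.

```python
import random
from typing import Hashable, List, Optional, Sequence, Set


def label_disclosure_complexity(
    queried_labels: Sequence[Hashable],
    all_classes: Set[Hashable],
) -> Optional[int]:
    """Count the queries needed until every class in all_classes has been seen.

    queried_labels holds the labels returned by the oracle, in query order.
    Returns None if the sequence ends before all classes are disclosed.
    """
    seen: Set[Hashable] = set()
    for n_queries, label in enumerate(queried_labels, start=1):
        seen.add(label)
        if seen >= all_classes:  # every target class has appeared at least once
            return n_queries
    return None


def random_query_baseline(
    labels: List[Hashable], n_trials: int = 1000, seed: int = 0
) -> float:
    """Mean label disclosure complexity when queries are selected at random.

    Hypothetical baseline: each trial queries the whole pool in a random order.
    """
    rng = random.Random(seed)
    classes = set(labels)
    total = 0
    for _ in range(n_trials):
        order = list(labels)
        rng.shuffle(order)
        # Never None here, because the full pool contains every class.
        total += label_disclosure_complexity(order, classes)
    return total / n_trials


if __name__ == "__main__":
    # Assumed toy pool: one majority class and two single-instance minority classes.
    pool = ["majority"] * 98 + ["minority_a", "minority_b"]
    print(random_query_baseline(pool))
```

Under these assumptions, an active learning criterion would be compared by passing the labels it obtains, in query order, to `label_disclosure_complexity` and checking the result against the random baseline.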
Related Publications

Of the same journal

Ranking MEDLINE documents (2014)
Article in International Scientific Journal
Célia Valente; Rui Camacho; Eugénio Oliveira
Gene clusters as intersections of powers of paths (2012)
Article in International Scientific Journal
Costa, VS; Dantas, S; Sankoff, D; Xu, X
Forgetting mechanisms for scalable collaborative filtering (2012)
Article in International Scientific Journal
Vinagre, J; Jorge, AM
A set of novel modifications to improve algorithms from the A* family applied in mobile robotics (2013)
Article in International Scientific Journal
Tiago Nascimento; Pedro Gomes Da Costa; Paulo Gomes Da Costa; António Paulo Moreira; André Conceição
A data warehouse to support web site automation (2014)
Article in International Scientific Journal
Marcos Aurélio Domingues; Carlos Soares; Alípio Mário Jorge; Solange Oliveira Rezende