Publication

Selecting classification algorithms with active testing on similar datasets

Title

Selecting classification algorithms with active testing on similar datasets

Type

Article in International Conference Proceedings Book

Date

2012

Authors

Rui Leite

(Author)

FEP

Pavel Brazdil

(Author)

FEP

Vanschoren, J

(Author)

Other

International Conference Proceedings

Title: CEUR Workshop Proceedings

Pages: 20-27

Workshop on Ubiquitous Data Mining, UDM 2012 - In Conjunction with the 20th European Conference on Artificial Intelligence, ECAI 2012

27 August 2012 through 31 August 2012

Indexing

Scopus - 4 Citations

Other Information

Authenticus ID: P-00K-SPX

Abstract (EN): Given the large number of data mining algorithms, their combinations (e.g. ensembles) and possible parameter settings, finding the most adequate method to analyze a new dataset becomes an ever more challenging task. This is because in many cases testing all potentially useful alternatives quickly becomes prohibitively expensive. In this paper we propose a novel technique, called active testing, that intelligently selects the most useful cross-validation tests. It proceeds in a tournament-style fashion, in each round selecting and testing the algorithm that is most likely to outperform the best algorithm of the previous round on the new dataset. This 'most promising' competitor is chosen based on a history of prior duels between both algorithms on similar datasets. Each new cross-validation test contributes information to a better estimate of dataset similarity, and thus helps to better predict which algorithms are most promising on the new dataset. We also follow a different path to estimate dataset similarity, based on data characteristics. We have evaluated this approach using a set of 292 algorithm-parameter combinations on 76 UCI datasets for classification. The results show that active testing quickly yields an algorithm whose performance is very close to the optimum, after relatively few tests. It also provides a better solution than previously proposed methods. The variants of our method that rely on cross-validation tests to estimate dataset similarity provide better solutions than those that rely on data characteristics.
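The tournament loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `History`, `similarity`, `active_testing`, `evaluate` and `budget` are hypothetical, and so are the choice of starting algorithm (best mean accuracy on prior datasets), the similarity weight (smoothed fraction of already observed duels whose winner agrees) and the challenger score (similarity-weighted accuracy gain over the current best). The abstract only states that such quantities are derived from prior duels on similar datasets.

```python
from typing import Callable, Dict, Tuple

# Hypothetical layout: prior dataset name -> algorithm name -> CV accuracy.
History = Dict[str, Dict[str, float]]


def similarity(prior: Dict[str, float], observed: Dict[str, float]) -> float:
    """Smoothed share of observed duels whose winner is the same on a prior dataset."""
    tested = sorted(observed)
    agree, total = 0, 0
    for i, a in enumerate(tested):
        for b in tested[i + 1:]:
            total += 1
            # Same sign of the accuracy difference means the same duel winner.
            if (observed[a] - observed[b]) * (prior[a] - prior[b]) > 0:
                agree += 1
    # Assumption: add-one smoothing, so no prior dataset gets weight zero.
    return (agree + 1) / (total + 2)


def active_testing(
    history: History,
    evaluate: Callable[[str], float],
    budget: int,
) -> Tuple[str, Dict[str, float]]:
    """Run at most `budget` cross-validation tests on the new dataset."""
    algorithms = sorted(next(iter(history.values())))

    def mean_accuracy(algo: str) -> float:
        return sum(perf[algo] for perf in history.values()) / len(history)

    # Assumption: start from the algorithm with the best mean prior accuracy.
    best = max(algorithms, key=mean_accuracy)
    observed = {best: evaluate(best)}

    for _ in range(budget - 1):
        candidates = [a for a in algorithms if a not in observed]
        if not candidates:
            break
        # Re-estimate dataset similarity from every test run so far.
        weights = {d: similarity(perf, observed) for d, perf in history.items()}

        def expected_gain(challenger: str) -> float:
            # Similarity-weighted gain of the challenger over the current best.
            return sum(
                w * max(0.0, history[d][challenger] - history[d][best])
                for d, w in weights.items()
            )

        challenger = max(candidates, key=expected_gain)
        observed[challenger] = evaluate(challenger)
        if observed[challenger] > observed[best]:
            best = challenger
    return best, observed
```

In practice `evaluate` would wrap a cross-validation run of the named algorithm on the new dataset. Here it is left as a plain callable so that the selection logic can be read and tested on its own.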

Language: English

Type (Faculty Evaluation): Scientific

Documents

No document associated with this publication was found.
