Efficient clustering of web-derived data sets

Title
Efficient clustering of web-derived data sets
Type
Article in International Conference Proceedings Book
Year
2009
Authors
Luís Sarmento
(Author)
Other
The person does not belong to the institution.
Alexander Kehlenbeck
(Author)
Other
The person does not belong to the institution.
Eugénio Oliveira
(Author)
FEUP
Lyle Ungar
(Author)
Other
The person does not belong to the institution.
International conference proceedings
Pages: 398-412
6th International Conference on Machine Learning and Data Mining in Pattern Recognition (MLDM 2009)
Leipzig, Germany, 23-25 July, 2009
Indexing
Scopus - 0 Citations
INSPEC
Scientific classification
FOS: Natural sciences > Computer and information sciences
CORDIS: Physical sciences > Computer science > Informatics
Other information
Authenticus ID: P-003-R7P
Abstract (EN): Many data sets derived from the web are large, high-dimensional, sparse and have a Zipfian distribution of both classes and features. On such data sets, current scalable clustering methods such as streaming clustering suffer from fragmentation, where large classes are incorrectly divided into many smaller clusters, and computational efficiency drops significantly. We present a new clustering algorithm based on connected components that addresses these issues and so works well on web-type data.
Language: English
Type (Professor's evaluation): Scientific
Contact: las@fe.up.pt; apk@google.com; eco@fe.up.pt; ungar@cis.upenn.edu
No. of pages: 15
License type: CC BY-NC
Documents
File name Description Size
Efficient clustering of web-derived data sets 276.64 KB
Related Publications

Of the same authors

An Approach to Web-Scale Named-Entity Disambiguation (2009)
Article in International Conference Proceedings Book
Luís Sarmento; Alexander Kehlenbeck; Eugénio Oliveira; Lyle Ungar

Of the same scientific areas

SIGA-Sistema Integrado de Gestão Autárquica (1987)
Technical Report
Gabriel David; Vladimiro Miranda; Maria Cristina Ribeiro
Moodle at FEUP (2005)
Technical Report
Jaime Enrique Villate Matiz
Studying the Impact of the Organizational Structure on Airline Operations Control (2015)
Chapter or Part of a Book
Nuno Machado; António Castro; Eugénio Oliveira
Normative and trust-based systems as enabler technologies for automated negotiation (2014)
Chapter or Part of a Book
Maria Joana Urbano; Henrique Lopes Cardoso; Eugénio Oliveira; Ana Paula Rocha
