Publication

Clustering Documents Using Tagging Communities and Semantic Proximity

Title

Clustering Documents Using Tagging Communities and Semantic Proximity

Type

Article in International Conference Proceedings

Date

2013

Authors

Elisabete Cunha

(Author)

Other

Alvaro Figueira

(Author)

FCUP

Oscar Mealha

(Author)

Other

International Conference Proceedings

Title: PROCEEDINGS OF THE 2013 8TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI 2013)

8th Iberian Conference on Information Systems and Technologies (CISTI)

Lisboa, PORTUGAL, JUN 19-22, 2013

Indexing

ISI Web of Knowledge - 0 Citations

Scopus - 2 Citations

Other Information

Authenticus ID: P-008-GWC

Abstract (EN): Euclidean distance and cosine similarity are frequently used measures in implementations of the k-means clustering algorithm. Cosine similarity is widely used because of its independence from document length, which allows patterns to be identified: two documents can be seen as identical if they share the same words but have different frequencies. However, during each clustering iteration the new centroids are still computed following Euclidean distance. Based on a consideration of these two measures, we propose the k-Communities clustering algorithm (k-C), which changes how new centroids are computed when cosine similarity is used. It begins by selecting the seeds from a network of tags to which a community detection algorithm has been applied. Each seed is the document with the greatest degree inside its community. Experimental results obtained with external evaluation measures show that the k-C algorithm is more effective than both k-means and k-means++. In addition, we applied all the external evaluation measures using both a manual and an automatic "Ground Truth", and the results show a strong correlation, which is a strong indicator that tests with this kind of measure can be performed even if the dataset structure is unknown.
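
The length-independence property mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the two term-frequency vectors are hypothetical, and the helper names `cosine_similarity` and `euclidean_distance` are chosen here for illustration. The sketch assumes the simplest case, where one document repeats the other's words at proportionally higher frequencies.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two term-frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical term-frequency vectors over the same three-word vocabulary.
doc_short = [1, 2, 0]
doc_long = [3, 6, 0]  # same words, three times the frequencies (assumed example)

cos = cosine_similarity(doc_short, doc_long)
dist = euclidean_distance(doc_short, doc_long)

print(f"cosine similarity:  {cos:.3f}")
print(f"euclidean distance: {dist:.3f}")
```

Under this assumption the cosine similarity is 1 (up to floating-point rounding), so the two documents look identical, while the Euclidean distance is clearly non-zero because it is sensitive to document length.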

Language: English

Type (Faculty Evaluation): Scientific

Number of pages: 6

Documents

No document associated with the publication was found.