Clustering Documents Using Tagging Communities and Semantic Proximity

Title

Clustering Documents Using Tagging Communities and Semantic Proximity

Type

Article in International Conference Proceedings Book

Date

2013

Authors

Elisabete Cunha

(Author)

Other

Alvaro Figueira

(Author)

FCUP

Oscar Mealha

(Author)

Other

Conference proceedings International

Title: PROCEEDINGS OF THE 2013 8TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI 2013)

8th Iberian Conference on Information Systems and Technologies (CISTI)

Lisboa, PORTUGAL, JUN 19-22, 2013

Indexing

ISI Web of Knowledge - 0 Citations

Scopus - 2 Citations

Other information

Authenticus ID: P-008-GWC

Abstract (EN): Euclidean distance and cosine similarity are frequently used measures in implementations of the k-means clustering algorithm. Cosine similarity is widely used because of its independence from document length, which allows the identification of patterns: two documents can be seen as identical if they share the same words but with different frequencies. However, during each clustering iteration, new centroids are still computed following Euclidean distance. Based on a consideration of these two measures, we propose the k-Communities clustering algorithm (k-C), which changes how new centroids are computed when cosine similarity is used. It begins by selecting the seeds from a network of tags to which a community detection algorithm has been applied. Each seed is the document with the greatest degree inside its community. The experimental results, obtained with external evaluation measures, show that the k-C algorithm is more effective than both k-means and k-means++. In addition, we computed all the external evaluation measures using both a manual and an automatic "Ground Truth"; the results show a strong correlation, which indicates that tests with this kind of measure are possible even if the dataset structure is unknown.
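
The contrast the abstract draws between the two measures can be illustrated with a small sketch. It is not the k-C algorithm, whose centroid update is not described on this page. The normalized centroid below is the standard spherical k-means update, swapped in only to show one way a centroid can be made consistent with cosine similarity. The function names and the toy term-frequency vectors are hypothetical.

```python
import math


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm(a) * norm(b))


def euclidean_distance(a, b):
    """Straight-line distance between two term-frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_centroid(vectors):
    """Classic k-means centroid: the component-wise arithmetic mean."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def unit_centroid(vectors):
    """Assumed alternative (spherical k-means, not the paper's update):
    scale every document to unit length, average, then rescale the
    average to unit length so document length has no influence."""
    units = [[x / norm(v) for x in v] for v in vectors]
    mean = mean_centroid(units)
    return [x / norm(mean) for x in mean]


# Toy term-frequency vectors (hypothetical data).
doc_short = [1.0, 2.0, 0.0]    # short document
doc_long = [10.0, 20.0, 0.0]   # same words, ten times the frequencies
doc_other = [0.0, 1.0, 3.0]    # different vocabulary profile

# Same direction, so cosine treats the two documents as identical,
# while Euclidean distance places them far apart.
same_direction = cosine_similarity(doc_short, doc_long)
far_apart = euclidean_distance(doc_short, doc_long)

# The arithmetic mean is pulled toward the longer document; the
# unit-length centroid sits at the same angle from both members.
plain = mean_centroid([doc_long, doc_other])
balanced = unit_centroid([doc_long, doc_other])
```

With the arithmetic mean, the long document dominates the centroid, so the cluster representative reflects document length rather than word pattern. The unit-length centroid removes that effect, which is the kind of mismatch the abstract points to when centroids are computed with Euclidean geometry while assignments use cosine similarity.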

Language: English

Type (Professor's evaluation): Scientific

No. of pages: 6
