Publication

Improving word embeddings in Portuguese: increasing accuracy while reducing the size of the corpus

Type
Article in International Scientific Journal
Year
2022-07-18
Authors
Maria Teresa Andrade
(Author)
FEUP
Viana P.
(Author)
Other
Pinto JP
(Author)
Other
Journal
Vol. 58
Final page: e964
Publisher: PEERJ INC
Indexing
Publication in ISI Web of Knowledge: 0 citations
Publication in Scopus: 0 citations
Other information
Authenticus ID: P-00X-396
Abstract (EN): The subjectivity of multimedia content descriptions has a strong negative impact on tag-based information retrieval. In our work, we propose enhancing available descriptions by adding semantically related tags. To achieve this objective, we use a word embedding technique based on the Word2Vec neural network, parameterized and trained on a new dataset built by scraping and pre-processing a large number of news stories from online newspapers. Our target language is Portuguese, one of the most spoken languages worldwide. The results achieved significantly outperform similar existing solutions developed for different languages, including Portuguese. Contributions also include an online application and an API available for external use. Although the presented work was designed to enhance multimedia content annotation, it can be used in several other application areas.
Language: English
Type (Professor's evaluation): Scientific
No. of pages: 22
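Illustrative sketch (not taken from the article): the abstract describes enriching a description by adding tags that are semantically close in a word embedding space. The idea can be shown with cosine similarity over a handful of toy vectors. Everything here is an assumption made for illustration: the vectors are invented rather than learned by Word2Vec, and the names `TOY_VECTORS`, `cosine`, `expand_tags`, `top_n`, and `min_similarity` are hypothetical, not part of the authors' application or API.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only. A trained
# Word2Vec model would supply learned vectors with many more dimensions.
TOY_VECTORS = {
    "futebol": [0.9, 0.1, 0.0],
    "jogador": [0.8, 0.2, 0.1],
    "golo": [0.85, 0.05, 0.05],
    "eleição": [0.0, 0.9, 0.2],
    "governo": [0.1, 0.85, 0.3],
    "praia": [0.1, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_tags(tags, vectors, top_n=2, min_similarity=0.9):
    """Return the original tags followed by up to top_n related words.

    A candidate word is related if its best cosine similarity to any
    known input tag reaches min_similarity (a hypothetical threshold).
    """
    known = [t for t in tags if t in vectors]
    scores = {}
    for candidate, vec in vectors.items():
        if candidate in tags:
            continue  # never suggest a tag that is already present
        best = max((cosine(vec, vectors[t]) for t in known), default=0.0)
        if best >= min_similarity:
            scores[candidate] = best
    ranked = sorted(scores, key=scores.get, reverse=True)
    return list(tags) + ranked[:top_n]


print(expand_tags(["futebol"], TOY_VECTORS))
# → ['futebol', 'golo', 'jogador']
```

With a real model, the dictionary lookup would be replaced by the trained embedding table; tags missing from the vocabulary are simply passed through unchanged, as in this sketch.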
Related Publications

Of the same journal

Supervised deep learning embeddings for the prediction of cervical cancer diagnosis (2018)
Article in International Scientific Journal
Kelwin Fernandes; Davide Chicco; Jaime S. Cardoso; Jessica Fernandes
Ordinal losses for classification of cervical cancer risk (2021)
Article in International Scientific Journal
Tomé Albuquerque; Ricardo Cruz; Jaime S. Cardoso
Formal verification of Matrix based MATLAB models using interactive theorem proving (2021)
Article in International Scientific Journal
Gauhar, A; Rashid, A; Hasan, O; João Bispo; João M. P. Cardoso