Abstract (PT):
Abstract
Background
BioTextRetriever is a Web-based search tool for retrieving relevant literature in Molecular Biology and related domains from MEDLINE. The core of BioTextRetriever is the dynamic construction of a classifier capable of selecting relevant papers from the whole MEDLINE bibliographic database. In this context, “relevant” papers are papers related to a set of DNA or protein sequences that the user provides as input to the tool.
Methods
Since the number of retrieved papers may be very large, BioTextRetriever uses a novel ranking algorithm to return the most relevant papers first. We have developed a new methodology that automates the assessment process using a multi-criteria ranking function. This function combines six factors: MeSH terms, the paper’s number of citations, the author’s h-index, the journal’s impact factor, the author’s number of publications, and a journal similarity function.
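The abstract does not say how the six factors are combined. As an illustration only, the sketch below assumes a weighted sum of min-max normalized factor values, which is one common way to build a multi-criteria ranking function and is not claimed to be the method used in BioTextRetriever. The factor names, the paper records, and the weights are all hypothetical.

```python
# Hypothetical factor names mirroring the six factors listed in the abstract.
FACTORS = (
    "mesh_terms",
    "citations",
    "h_index",
    "impact_factor",
    "author_publications",
    "journal_similarity",
)


def min_max_normalize(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_papers(papers, weights):
    """Return (pmid, score) pairs sorted by descending score.

    papers  -- list of dicts with a 'pmid' key and one numeric value per factor
    weights -- dict mapping each factor name to its (assumed) weight
    """
    if not papers:
        return []
    # Normalize each factor across the candidate set so scales are comparable.
    normalized = {f: min_max_normalize([p[f] for p in papers]) for f in FACTORS}
    scored = [
        (p["pmid"], sum(weights[f] * normalized[f][i] for f in FACTORS))
        for i, p in enumerate(papers)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up records, for illustration only.
example_papers = [
    {"pmid": "A", "mesh_terms": 5, "citations": 120, "h_index": 40,
     "impact_factor": 4.0, "author_publications": 90, "journal_similarity": 0.8},
    {"pmid": "B", "mesh_terms": 2, "citations": 10, "h_index": 12,
     "impact_factor": 1.5, "author_publications": 20, "journal_similarity": 0.3},
]
equal_weights = {f: 1.0 / len(FACTORS) for f in FACTORS}
ranking = rank_papers(example_papers, equal_weights)
```

Raising the weights of `citations` and `h_index` relative to the others would be one way to emphasize those two factors in such a scheme.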
Results
The best results highlight the number of citations and the h-index factors.
Conclusions
We have developed a multi-criteria ranking function that considers six factors and that seems appropriate for retrieving relevant papers from a huge repository such as MEDLINE.
Keywords:
Ranking; Text mining; Machine learning
Abstract (EN):
Background: BioTextRetriever is a Web-based search tool for retrieving relevant literature in Molecular Biology and related domains from MEDLINE. The core of BioTextRetriever is the dynamic construction of a classifier capable of selecting relevant papers from the whole MEDLINE bibliographic database. In this context, “relevant” papers are papers related to a set of DNA or protein sequences that the user provides as input to the tool. Methods: Since the number of retrieved papers may be very large, BioTextRetriever uses a novel ranking algorithm to return the most relevant papers first. We have developed a new methodology that automates the assessment process using a multi-criteria ranking function. This function combines six factors: MeSH terms, the paper’s number of citations, the author’s h-index, the journal’s impact factor, the author’s number of publications, and a journal similarity function. Results: The best results highlight the number of citations and the h-index factors. Conclusions: We have developed a multi-criteria ranking function that considers six factors and that seems appropriate for retrieving relevant papers from a huge repository such as MEDLINE. © 2014, Gonçalves et al.; licensee Springer.
Language:
English
Type (Faculty Evaluation):
Scientific
Number of pages:
16