Abstract (EN):
Monodic folk music has traditionally been preserved in physical documents. It constitutes a vast archive that needs to be digitized to enable comprehensive analysis with AI techniques. A critical component of music score digitization is the transcription of lyrics, a process extensively researched in Optical Character Recognition (OCR) and document layout analysis. These fields typically require developing specialized models that operate in several stages: first detecting the bounding boxes of the text regions, then identifying the language, and finally recognizing the characters. Recent advances in vision-language models (VLMs) have introduced multimodal capabilities, such as joint processing of images and text, that are competitive with traditional OCR methods. This paper proposes an end-to-end system for extracting lyrics from images of handwritten musical scores. We aim to evaluate the performance of two state-of-the-art VLMs to determine whether they can eliminate the need to develop specialized text recognition and OCR models for this task. The results of the study, obtained from a dataset in a real-world application environment, are presented along with promising new research directions in the field. This progress contributes to preserving cultural heritage and opens up new possibilities for global analysis and research in folk music.
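
Illustrative sketch (not from the paper): the abstract contrasts the traditional multi-stage OCR pipeline with a single end-to-end VLM query, and the following minimal Python snippet shows what such a query could look like. The model name, prompt, file name, and use of the OpenAI client are assumptions for illustration only; the paper does not disclose which VLMs, prompts, or APIs it evaluates.

    # Minimal sketch: ask a multimodal VLM to transcribe only the lyrics
    # from a handwritten score image. Assumes the OpenAI Python SDK (>= 1.0)
    # and a vision-capable model such as "gpt-4o"; any comparable VLM
    # endpoint could be substituted.
    import base64
    from openai import OpenAI

    def extract_lyrics(image_path: str, model: str = "gpt-4o") -> str:
        """Return the lyric text a VLM reads from one score page image."""
        with open(image_path, "rb") as f:
            b64 = base64.b64encode(f.read()).decode("utf-8")

        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        response = client.chat.completions.create(
            model=model,
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": ("Transcribe only the lyrics written under the "
                              "staves of this handwritten music score. "
                              "Preserve line breaks; do not describe the music.")},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
                ],
            }],
        )
        return response.choices[0].message.content

    if __name__ == "__main__":
        print(extract_lyrics("score_page_001.jpg"))  # hypothetical file name

Compared with the multi-stage pipeline described above, a single prompt of this kind folds text detection, language handling, and character recognition into one model call, which is the trade-off the paper sets out to evaluate.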
Language:
English
Type (Professor's evaluation):
Scientific
No. of pages:
12