Publication

Expanding FLORES+ benchmark for more low-resource settings: Portuguese-Emakhuwa machine translation evaluation

Type
Article in International Conference Proceedings Book
Year
2024
Authors
Ali, Felermino
(Author)
Conference proceedings International
Pages: 579-592
9th Conference on Machine Translation
Miami, 2024
Other information
Abstract (EN): As part of the Open Language Data Initiative shared tasks, we have expanded the FLORES+ evaluation set to include Emakhuwa, a low-resource language widely spoken in Mozambique. We translated the dev and devtest sets from Portuguese into Emakhuwa, and we detail the translation process and quality assurance measures used. Our methodology involved various quality checks, including post-editing and adequacy assessments. The resulting datasets consist of multiple reference sentences for each source. We present baseline results from training a Neural Machine Translation system and fine-tuning existing multilingual translation models. Our findings suggest that spelling inconsistencies remain a challenge in Emakhuwa. Additionally, the baseline models underperformed on this evaluation set, underscoring the necessity for further research to enhance machine translation quality for Emakhuwa. The data is publicly available at https://huggingface.co/datasets/LIACC/Emakhuwa-FLORES
Language: English
Type (Professor's evaluation): Scientific
Contact: Available at: https://arxiv.org/abs/2408.11457
Documents
File name Description Size
2024.wmt-1.45 1981.27 KB
Related Publications

Of the same authors

Expanding FLORES+ Benchmark for more Low-Resource Settings: Portuguese-Emakhuwa Machine Translation Evaluation (2024)
Poster in an International Conference
Felermino Ali; Henrique Lopes Cardoso; Sousa-Silva, Rui
Building Resources for Emakhuwa: Machine Translation and News Classification Benchmarks (2024)
Poster in an International Conference
Felermino Ali; Henrique Lopes Cardoso; Sousa-Silva, Rui
SSA-COMET: Do LLMs outperform learned metrics in evaluating MT for under-resourced African languages? (2025)
Article in International Conference Proceedings Book
Li, Senyu; Wang, Jiayi; Ali, Felermino D. M. A.; Cherry, Colin; Deutsch, Daniel; Briakou, Eleftheria; Sousa-Silva, Rui; Cardoso, Henrique Lopes; Stenetorp, Pontus; Adelani, David Ifeoluwa
Leveraging loanword constraints for improving machine translation in a low-resource multilingual context (2025)
Article in International Conference Proceedings Book
Ali, Felermino D. M. A.; Cardoso, Henrique Lopes; Sousa-Silva, Rui
Evaluating WMT 2025 Metrics shared task submissions on the SSA-MTE African challenge set (2025)
Article in International Conference Proceedings Book
Li, Senyu; Ali, Felermino D. M. A.; Wang, Jiayi; Sousa-Silva, Rui; Cardoso, Henrique Lopes; Stenetorp, Pontus; Cherry, Colin; Adelani, David Ifeoluwa
