Resumo (PT):
Abstract (EN):
This research investigates how to improve machine translation for low-resource languages by integrating loanword constraints as external linguistic knowledge. Focusing on the Portuguese-Emakhuwa language pair, which exhibits significant lexical borrowing, we address the challenge of correctly adapting loanwords during translation. We propose a novel approach that augments source sentences with loanword constraints, explicitly linking each source-language loanword to its target-language equivalent. We then perform supervised fine-tuning of multilingual neural machine translation models and of several Large Language Models of different sizes. Our results demonstrate that incorporating loanword constraints significantly improves both overall translation quality and the correct adaptation of loanwords in the target language, as measured by several machine translation metrics. This approach offers a promising direction for improving machine translation in low-resource settings characterized by frequent lexical borrowing.
Language:
English
Type (Professor's evaluation):
Scientific