Abstract (EN):
The task of Question Generation (QG) has attracted the interest of the natural language processing community in recent years. QG aims to automatically generate well-formed questions from an input (e.g., text), which can be especially relevant for computer-supported educational platforms. Recent work relies on large-scale question-answering (QA) datasets (in English) to train and build the QG systems. However, large-scale quality QA datasets are not widely available for lower-resourced languages. In this respect, this research addresses the task of QG in a lower-resourced language (Portuguese) using a traditional rule-based approach for generating wh-questions. We perform a feasibility analysis of the approach through a comprehensive evaluation supported by two studies: (1) comparing the similarity between machine-generated and human-authored questions using automatic metrics, and (2) comparing the perceived quality of machine-generated questions to those elaborated by humans. Although the results show that rule-based generated questions fall short in quality compared to those authored by humans, they also suggest that a rule-based approach remains a feasible alternative to neural-based techniques when these are not viable. The code is publicly available at https://github.com/bernardoleite/question-generation-portuguese. Copyright © 2023 by SCITEPRESS - Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
Language:
English
Type (Professor's evaluation):
Scientific
No. of pages:
11