Abstract (PT):
Abstract (EN):
Narratives have been the subject of extensive research across various scientific fields such as linguistics and
computer science. However, the scarcity of freely available datasets, essential for studying this genre, remains a
significant obstacle. Furthermore, datasets annotated with narrative components and their morphosyntactic and
semantic information are even scarcer. To address this gap, we developed the Text2Story Lusa datasets, which
consist of a collection of news articles in European Portuguese. The first dataset consists of 357 news articles
and the second dataset comprises a subset of 117 manually and densely annotated articles, totaling over 50 thousand
individual annotations. By focusing on texts with substantial narrative elements, we aim to provide a valuable
resource for studying narrative structures in European Portuguese news articles. On the one hand, the first dataset
provides researchers with data to study narratives from various perspectives. On the other hand, the annotated
dataset facilitates research in information extraction and related tasks, particularly in the context of narrative extraction
pipelines. Both datasets are made available adhering to FAIR principles, thereby enhancing their utility within the
research community.
Language:
English
Type (Professor's evaluation):
Scientific
Contact:
Available at: https://aclanthology.org/2024.lrec-main.1370/