Publication

Human Experts vs. Large Language Models: Evaluating Annotation Scheme and Guidelines Development for Clinical

Title
Human Experts vs. Large Language Models: Evaluating Annotation Scheme and Guidelines Development for Clinical
Type
Article in International Conference Proceedings Book
Year
2025
Authors
Fernandes, Ana Luísa Cardoso
(Author)
Other
Silvano, Maria da Purificação
(Author)
FLUP
Guimaraes, Nuno
(Author)
FEUP
Rb-Silva, Rita
(Author)
Other
Conference proceedings (International)
Text2Story — Eighth Workshop on Narrative Extraction From Texts held in conjunction with the 47th European Conference on Information Retrieval (ECIR 2025)
Lucca, 2025
Scientific classification
CORDIS: Humanities > language sciences > Linguistics
Other information
Abstract (EN): Electronic Health Records (EHRs) contain vast amounts of unstructured narrative text, posing challenges for organization, curation, and automated information extraction in clinical and research settings. Developing effective annotation schemes is crucial for training extraction models, yet it remains complex for both human experts and Large Language Models (LLMs). This study compares human- and LLM-generated annotation schemes and guidelines through an experimental framework. In the first phase, both a human expert and an LLM created annotation schemes based on predefined criteria. In the second phase, experienced annotators applied these schemes following the guidelines. In both cases, the results were qualitatively evaluated using Likert scales. The findings indicate that the human-generated scheme is more comprehensive, coherent, and clear compared to those produced by the LLM. These results align with previous research suggesting that while LLMs show promising performance with respect to text annotation, the same does not apply to the development of annotation schemes, and human validation remains essential to ensure accuracy and reliability.
Language: English
Type (Professor's evaluation): Scientific
Contact: Available at: https://ceur-ws.org/Vol-3964/paper13.pdf
Notes: The following authors also contributed to this article: Tahsir Ahmed Munna, Filipe Cunha, António Leal, Ricardo Campos, and Alípio Jorge.
No. of pages: 11
Documents
File name: Human
Size: 331.61 KB