A Bootstrapping Approach for Training a NER with Conditional Random Fields

Title
A Bootstrapping Approach for Training a NER with Conditional Random Fields
Type
Article in International Conference Proceedings Book
Year
2011
Authors
sarmento, l
(Author)
Other
oliveira, e
(Author)
FEUP
Conference proceedings International
Pages: 664-678
15th Portuguese Conference on Artificial Intelligence (EPIA 2011)
Lisbon, Portugal, October 10-13, 2011
Indexing
Scientific classification
FOS: Natural sciences > Computer and information sciences
Other information
Authenticus ID: P-002-VZ3
Abstract (EN): In this paper we present a bootstrapping approach for training a Named Entity Recognition (NER) system. Our method starts by annotating person names in a dataset of 50,000 news items, using a simple dictionary-based approach. From this training set we build a classification model based on Conditional Random Fields (CRF). We then use the inferred classification model to perform additional annotations of the initial seed corpus, which is then used for training a new classification model. This cycle is repeated until the NER model stabilizes. We evaluate each of the bootstrapping iterations by calculating: (i) the precision and recall of the NER model in annotating a small gold-standard collection (HAREM); (ii) the precision and recall of the CRF bootstrapping annotation method over a small sample of news; and (iii) the correctness and the number of new names identified. Additionally, we compare the NER model with a dictionary-based approach, our baseline method. Results show that our bootstrapping approach stabilizes after 7 iterations, achieving high values of precision (83%) and recall (68%).
Language: English
Type (Professor's evaluation): Scientific
Contact: jft@fe.up.pt; las@fe.up.pt; eco@fe.up.pt
No. of pages: 15
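The bootstrapping cycle described in the abstract (dictionary-based seed annotation, model training, re-annotation of the corpus, repetition until the output stabilizes) can be illustrated with a minimal sketch. This is not the authors' implementation. To stay runnable with the standard library only, the CRF is replaced by a toy stand-in, the hypothetical ContextTagger, which memorizes tagged names and the words that precede them. The corpus, seed dictionary, gold labels, and all function names are likewise assumptions made up for illustration, and precision and recall are computed per token.

```python
from typing import List, Set, Tuple

Sentence = List[str]
Labels = List[str]  # one tag per token: "PER" or "O"


def dictionary_annotate(sentence: Sentence, names: Set[str]) -> Labels:
    """Baseline: tag a token as PER only if it appears in the name dictionary."""
    return ["PER" if tok in names else "O" for tok in sentence]


class ContextTagger:
    """Toy stand-in for the CRF (an assumption, not the paper's model).

    It memorizes tokens tagged PER and the untagged words right before them.
    """

    def fit(self, corpus: List[Sentence], labels: List[Labels]) -> "ContextTagger":
        self.names: Set[str] = set()
        self.triggers: Set[str] = set()
        for sent, tags in zip(corpus, labels):
            for i, (tok, tag) in enumerate(zip(sent, tags)):
                if tag == "PER":
                    self.names.add(tok)
                    if i > 0 and tags[i - 1] == "O":
                        self.triggers.add(sent[i - 1])
        return self

    def predict(self, sent: Sentence) -> Labels:
        tags = []
        for i, tok in enumerate(sent):
            known = tok in self.names
            # A capitalized token right after a learned trigger word is tagged PER.
            cued = i > 0 and sent[i - 1] in self.triggers and tok[:1].isupper()
            tags.append("PER" if known or cued else "O")
        return tags


def bootstrap(corpus: List[Sentence], seed_names: Set[str], max_iters: int = 10):
    """Train, re-annotate the same corpus, retrain; stop when annotations stop changing."""
    labels = [dictionary_annotate(s, seed_names) for s in corpus]
    model = ContextTagger().fit(corpus, labels)
    iterations = 1  # number of models trained so far
    while iterations < max_iters:
        new_labels = [model.predict(s) for s in corpus]
        if new_labels == labels:  # stable: re-annotation adds nothing new
            break
        labels = new_labels
        model = ContextTagger().fit(corpus, labels)
        iterations += 1
    return model, labels, iterations


def precision_recall(gold: List[Labels], predicted: List[Labels]) -> Tuple[float, float]:
    """Token-level precision and recall for the PER tag."""
    tp = fp = fn = 0
    for g_tags, p_tags in zip(gold, predicted):
        for g, p in zip(g_tags, p_tags):
            if p == "PER" and g == "PER":
                tp += 1
            elif p == "PER":
                fp += 1
            elif g == "PER":
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up toy data (not from the paper).
CORPUS = [
    ["President", "Silva", "spoke"],
    ["President", "Costa", "replied"],
    ["Costa", "visited", "Porto"],
    ["Minister", "Costa", "met", "Minister", "Santos"],
]
SEED_NAMES = {"Silva"}
GOLD = [
    ["O", "PER", "O"],
    ["O", "PER", "O"],
    ["PER", "O", "O"],
    ["O", "PER", "O", "O", "PER"],
]

baseline = [dictionary_annotate(s, SEED_NAMES) for s in CORPUS]
model, final_labels, iterations = bootstrap(CORPUS, SEED_NAMES)
print("baseline (precision, recall):", precision_recall(GOLD, baseline))
print("bootstrapped (precision, recall):", precision_recall(GOLD, final_labels))
print("models trained before stabilizing:", iterations)
```

On this toy corpus each round uses contexts learned in the previous round to find names the dictionary missed, which is the effect the abstract reports at scale. A real CRF implementation could take the place of ContextTagger behind the same fit/predict interface; that interface is a convenience of this sketch, not something taken from the paper.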
Related Publications

Of the same authors

Semi-Automatic Creation of a Reference News Corpus for Fine-Grained Multi-Label Scenarios (2011)
Article in International Conference Proceedings Book
teixeira, j; sarmento, l; oliveira, e
Comparing Verb Synonym Resources for Portuguese (2010)
Article in International Conference Proceedings Book
teixeira, j; sarmento, l; oliveira, e