Language Processing and Information Extraction

Code: PRODEI034     Acronym: PLEI

Keywords
Classification Keyword
Official Intelligent Systems

Instance: 2020/2021 - 1S

Active? Yes
Web Page: https://moodle.up.pt/enrol/index.php?id=1578
Responsible unit: Department of Informatics Engineering
Course/CS Responsible: Doctoral Program in Informatics Engineering

Cycles of Study/Courses

Acronym | No. of Students | Study Plan | Curricular Years | Credits UCN | Credits ECTS | Contact hours | Total Time
PRODEI | 13 | Syllabus | 1 | - | 6 | 28 | 162
Last updated on 2020-09-13.

Fields changed: Teaching methods and learning activities, Program, Complementary Bibliography, Mandatory literature, Web Page

Teaching language

Suitable for English-speaking students

Objectives

The main objective of this course is to equip students with knowledge of natural language processing and information extraction techniques, combining the presentation of theoretical foundations with practical applications.

Learning outcomes and competences

Upon completing the course students should be able to:

- Explain the fundamental concepts and techniques in natural language processing and information extraction;
- Demonstrate knowledge of relevant literature and be able to synthesize and present research work;
- Design and implement systems that perform analysis and automatic extraction of information expressed in natural language.

Working method

In person

Program

The curricular unit is organized in two parts: a theoretical component and a project component. The theoretical component will introduce concepts in language processing and information extraction and discuss recent literature on the subject.

The project component will allow students to apply these concepts in practical case studies. Students will research, develop, and evaluate a language processing and information extraction solution. During the research and development stages, students will be tutored by the teacher.

The course will address the following topics:
- Introduction to natural language processing: definitions, tasks, and applications.
- Basic text processing: regular expressions, tokenization, normalization, lemmatization, stemming, segmentation.
- Language models: n-grams.
- Text classification: bag-of-words, Naive Bayes, feature engineering; generative and discriminative classifiers.
- Vectorized representations of words: lexical semantics, word embeddings.
- Sequence models: hidden Markov models, conditional random fields; POS-tagging and named entity recognition.
- Neural networks in natural language processing: neural language models, recurrent neural networks, encoder-decoder networks, attention, transformer networks.
- Information extraction: named entity recognition and relation extraction, event and time extraction, template filling.
- Contemporary research in natural language processing and information extraction.

Mandatory literature

Daniel Jurafsky, James H. Martin; Speech and language processing. ISBN: 0-13-095069-6 (https://web.stanford.edu/~jurafsky/slp3/)

Complementary Bibliography

Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze; Introduction to information retrieval. ISBN: 978-0-521-86571-5 (Full content available at http://nlp.stanford.edu/IR-book/)
Steven Bird, Ewan Klein, Edward Loper; Natural Language Processing with Python, O'Reilly Media, 2009. ISBN: 978-0-596-51649-9 (Full content available at http://www.nltk.org/book/)
Yoav Goldberg; Neural network methods for natural language processing. ISBN: 978-1-62705-298-6
Jacob Eisenstein; Introduction to natural language processing. ISBN: 978-0-262-04284-0

Teaching methods and learning activities

Students will have to attend lectures. Individual research work will be supported by the teacher on a one-to-one basis.

Students define and develop a semester-long project. Project themes are proposed by the students and validated by the teacher.

The evaluation of the project is based on two components:
1) SP: short-paper - 30% of the final grade
2) FP: full-paper - 70% of the final grade.

The SP component will be evaluated halfway through the semester and will consist of:
- SP1: a short paper describing the student's initial investigations into the selected problem.
- SP2: short presentation (10 minutes) on the work done so far.

The FP component will be evaluated at the end of the semester and consists of:
- FP1: a full-paper (written in English) containing a description of the final solution of the problem, and results of the evaluation experiments regarding the proposed solution.
- FP2: public presentation (25 minutes) and demonstration of the project.

Keywords

Technological sciences > Engineering > Computer engineering

Evaluation Type

Distributed evaluation without final exam

Assessment Components

Designation Weight (%)
Oral exam 35.00
Written assignment 65.00
Total: 100.00

Amount of time allocated to each course unit

Designation Time (hours)
Project development 56.00
Independent study 42.00
Class attendance 42.00
Total: 140.00

Eligibility for exams

A minimum score of 7 out of 20 is required in every evaluation component (SP1, SP2, FP1 and FP2). To obtain a final grade, students must reach this minimum in each of the four components.

Calculation formula of final grade

The final grade (CF) is calculated as follows:

CF = (20% * SP1 + 10% * SP2) + (45% * FP1 + 25% * FP2).

Evaluation components:
- SP1: short-paper
- SP2: short presentation (10 minutes)
- FP1: full-paper
- FP2: public presentation (25 minutes) and demonstration of the project.
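
The formula can be checked with a few lines of Python. This is only a minimal sketch: the weights come from the formula above and the minimum of 7 out of 20 from the eligibility rule, but the function name `final_grade`, the dictionary layout, and the choice to return `None` when a component is below the minimum are assumptions made for illustration. No rounding is applied, since the page does not state a rounding rule.

```python
from typing import Dict, Optional

# Weights as given in the formula for CF.
WEIGHTS: Dict[str, float] = {"SP1": 0.20, "SP2": 0.10, "FP1": 0.45, "FP2": 0.25}

# Minimum score required in each component (0-20 scale), per the eligibility rule.
MIN_SCORE = 7.0


def final_grade(scores: Dict[str, float]) -> Optional[float]:
    """Return CF on the 0-20 scale, or None if any component is below the minimum.

    Returning None for an ineligible student is an assumption of this sketch.
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must contain exactly SP1, SP2, FP1 and FP2")
    if any(scores[name] < MIN_SCORE for name in WEIGHTS):
        return None
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"SP1": 14.0, "SP2": 16.0, "FP1": 15.0, "FP2": 17.0}
cf = final_grade(example)
```

With these hypothetical scores the formula works out to 0.20 × 14 + 0.10 × 16 + 0.45 × 15 + 0.25 × 17 = 15.4. The four weights sum to 100%, with SP1 + SP2 giving the 30% short-paper component and FP1 + FP2 the 70% full-paper component.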

Special assessment (TE, DA, ...)

Students under special evaluation constraints are exempt from attending lectures. However, they still have to give the presentations described above, and their final grades will be determined by the same evaluation criteria. In these cases, the students must schedule regular meetings with the teacher to discuss the ongoing work.

Classification improvement

Only the end-of-semester evaluation (70%) can be subject to grade improvement. The student will have to submit a new research work (i.e., a full paper) and give the corresponding public presentation.

Copyright 1996-2025 © Faculdade de Engenharia da Universidade do Porto