Natural Language Processing
Keywords
Classification | Keyword
Official | Artificial Intelligence
Instance: 2022/2023 - 2S
Cycles of Study/Courses
Acronym | No. of Students | Study Plan | Curricular Years | Credits UCN | Credits ECTS | Contact hours | Total Time
M.EIC | 28 | Syllabus | 1 | - | 6 | 39 | 162
Teaching language
English
Objectives
This course provides an introduction to the field of Natural Language Processing (NLP). By the end of the course, students should have a comprehensive understanding of the field, its state of the art, and recent research trends.
Learning outcomes and competences
Learning goals include:
- Acquire the fundamental linguistic concepts that are relevant to processing natural language text.
- Understand both basic and state-of-the-art algorithms and techniques for dealing with natural language text.
- Become familiar with state-of-the-art NLP tools and linguistic resources.
- Understand and employ evaluation metrics for different NLP tasks.
- Be able to formulate an NLP classification problem and address it with the appropriate techniques, algorithms, and tools.
- Read and understand current research on natural language processing.
Working method
In person
Prerequisites (prior knowledge) and co-requisites (common knowledge)
Knowledge of Python programming.
Basic knowledge of machine learning techniques.
Program
- Introduction to natural language processing: definitions, tasks, and applications.
- Basic text processing: regular expressions, tokenization, normalization, lemmatization, stemming, segmentation.
- Language models: n-grams.
- Text classification: bag-of-words, Naive Bayes, feature engineering; generative and discriminative classifiers.
- Vectorized representations of words: lexical semantics, word embeddings.
- Sequence models: hidden Markov models, conditional random fields; POS-tagging and named entity recognition.
- Neural networks in natural language processing: neural language models, recurrent neural networks, encoder-decoder networks, attention, transformer networks.
- Contemporary research in natural language processing and information extraction.
Mandatory literature
Daniel Jurafsky;
Speech and language processing. ISBN: 0-13-095069-6
Complementary Bibliography
Jacob Eisenstein;
Introduction to natural language processing. ISBN: 978-0-262-04284-0
Yoav Goldberg;
Neural network methods for natural language processing. ISBN: 978-1-62705-298-6
Teaching methods and learning activities
Course topics will be covered with motivating applications, and with source code examples, where applicable. The aim is to introduce the tools that are to be used in practical assignments as soon as possible. At the same time, pointers to related literature will be given as further reading opportunities. Students will be asked to make short presentations on recent research trends in NLP. Short in-class quizzes will be used to ensure the retention of the main concepts.
Evaluation Type
Distributed evaluation with final exam
Assessment Components
Designation | Weight (%)
Presentation/discussion of a scientific work | 10.00
Exam | 30.00
Laboratory work | 60.00
Total: | 100.00
Amount of time allocated to each course unit
Designation | Time (hours)
Presentation/discussion of a scientific work | 3.00
Independent study | 40.00
Class attendance | 39.00
Research work | 10.00
Laboratory work | 70.00
Total: | 162.00
Eligibility for exams
A minimum grade of 50% in each of the assessment components.
Calculation formula of final grade
Evaluation will be composed of:
- 2 practical assignments (2x6/20)
- 1 oral presentation related to a recent research direction (2/20)
- 1 final exam (6/20)
For approval, a minimum grade of 35% is required in the final exam.
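As an illustration only, the weighting above can be sketched in Python. The function name `final_grade` and the assumption that every component is marked on a 0-20 scale are not stated on this page; the 35% exam minimum is taken from the sentence above, and no overall passing mark is modelled because none is given here.

```python
# Hypothetical helper, not an official calculator.
# Assumption: each component is marked on a 0-20 scale.

EXAM_MINIMUM = 7.0  # 35% of 20, the minimum exam grade stated above


def final_grade(assignment_1, assignment_2, presentation, exam):
    """Return (weighted grade out of 20, whether the exam minimum is met)."""
    # Weights out of 20: 6 per assignment, 2 for the presentation, 6 for the exam.
    weighted = (6 * assignment_1 + 6 * assignment_2 + 2 * presentation + 6 * exam) / 20
    meets_exam_minimum = exam >= EXAM_MINIMUM
    return weighted, meets_exam_minimum


grade, exam_ok = final_grade(assignment_1=16, assignment_2=14, presentation=18, exam=10)
print(f"{grade:.1f}", exam_ok)
```

With these example marks the contributions are 4.8 + 4.2 + 1.8 + 3.0, giving 13.8 out of 20, and the exam minimum is met.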
Examinations or Special Assignments
Evaluation in special seasons consists of two practical assignments and a written exam, where each of these two components (assignments and exam) accounts for 50% of the final grade. Approval in the course requires a minimum score of 50% in each of the practical assignments and a minimum of 35% in the written exam.
Special assessment (TE, DA, ...)
All assessment components are required for all students. Students enrolled under special attendance regimes, who are not required to attend classes, must arrange appropriate consultation and evaluation sessions with the teachers.
Classification improvement
Grades in the distributed component (assignments and oral presentation) can only be improved in the next edition of the course.