Language Processing and Information Extraction

Code:

PRODEI034

Acronym:

PLEI

Keywords
Classification	Keyword
Official	Intelligent Systems

Instance: 2012/2013 - 1S

Active?	Yes
Responsible unit:	Department of Informatics Engineering
Course/CS Responsible:	Doctoral Program in Informatics Engineering

Cycles of Study/Courses

Acronym	No. of Students	Study Plan	Curricular Years	Credits UCN	Credits ECTS	Contact hours	Total Time
PRODEI	2	Syllabus	1	-	6	54	162

Teaching language

Suitable for English-speaking students

Objectives

The main objective of this course is to equip students with knowledge of natural language processing and information extraction techniques, and to present a set of real application scenarios for such techniques, demonstrating how to process large amounts of information available in different types of repositories, such as news, scientific articles, blogs and social networks. We will also present machine learning techniques for supervised and unsupervised classification, which are fundamental in the development of language processing and information extraction systems.

Upon completing the course students should be able to:
- Explain the fundamental concepts and techniques in natural language processing and information extraction
- Demonstrate knowledge of relevant literature and be able to synthesize and present research work
- Design and implement systems that perform analysis and automatic extraction of information expressed in natural language or in a semi-structured format.

Program

The curricular unit is organized in two parts: a theoretical component and a project component. The theoretical component will introduce concepts in language processing and information extraction and discuss recent literature on the subject. The project component will allow students to apply these concepts in practical case studies. Students will research, develop and evaluate a language processing and information extraction solution. During the research and development stages, students will work under the guidance of a tutor.

The course will address the following topics:
- Introduction to basic problems in language processing and to related support resources.
- Presentation of techniques and typical applications of natural language processing and information extraction: named entity recognition, collocations, POS tagging, automatic summarization, sentiment analysis, word-sense disambiguation, etc.
- Introduction to machine learning techniques for text classification and topic extraction (e.g., SVMs, Latent Dirichlet Allocation). Representation of documents: bag-of-words, n-grams.
- Processing of user-generated content and extraction of information from social networks (e.g., blogs, micro-blogs). Folksonomies, identification of topics, summarization, content recommendation.
- Extraction of semantic relations and named entity disambiguation using external resources (e.g., Wikipedia, WordNet).
- Log analysis, pattern and trend detection; recommendations.
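As a small illustration of the document representations named in the topic list (bag-of-words and n-grams), the following is a minimal sketch in Python. The function names, the whitespace tokenizer and the sample sentence are assumptions made for illustration only; they are not part of the course material.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(text):
    """Map each token to the number of times it occurs, ignoring word order."""
    return Counter(tokenize(text))


def ngrams(tokens, n):
    """Return all contiguous sequences of n tokens, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample document.
doc = "the cat sat on the mat"
bow = bag_of_words(doc)
bigrams = ngrams(tokenize(doc), 2)

print(bow)
print(bigrams)
```

The bag-of-words count discards word order, while the bigrams keep local order; a realistic system would replace the whitespace tokenizer with one that handles punctuation.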

Mandatory literature

Christopher D. Manning and Hinrich Schütze; Foundations of Statistical Natural Language Processing, MIT Press, 1999. ISBN: 0-262-13360-1

Teaching methods and learning activities

Students will have to attend lectures. Individual research work will be supported by the teacher on a one-to-one basis.

keywords

Technological sciences > Engineering > Computer engineering

Evaluation Type

Distributed evaluation without final exam

Eligibility for exams

A minimum score of 7 out of 20 is required in every evaluation component (SP1, SP2, FP1 and FP2). To obtain a final grade, students must reach this minimum in all four components.

Calculation formula of final grade

The final grade (CF) is calculated as follows:

CF = (20% * SP1 + 10% * SP2) + (45% * FP1 + 25% * FP2)

Where (see below for a more detailed description):

- SP1: short-paper
- SP2: short presentation (10 minutes)

- FP1: full-paper
- FP2: public presentation (25 minutes) and demonstration of the project.
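As a worked illustration of the formula above, here is a minimal sketch in Python. The function name and the sample scores are hypothetical, and returning no grade when a component falls below the 7-out-of-20 minimum is an assumed reading of the eligibility rule; any rounding of the final grade is not specified here and is not applied.

```python
# Component weights taken from the final grade formula.
WEIGHTS = {"SP1": 0.20, "SP2": 0.10, "FP1": 0.45, "FP2": 0.25}

# Minimum score required in each component (scale of 0 to 20).
MINIMUM = 7.0


def final_grade(scores):
    """Return the weighted final grade CF, or None if any component is below the minimum."""
    if any(scores[name] < MINIMUM for name in WEIGHTS):
        return None  # assumption: no final grade is awarded in this case
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for one student.
example = {"SP1": 14, "SP2": 12, "FP1": 16, "FP2": 15}
print(final_grade(example))

# Same student, but with one component below the minimum.
failing = {"SP1": 6, "SP2": 12, "FP1": 16, "FP2": 15}
print(final_grade(failing))
```

For the first set of scores the weighted sum is 2.8 + 1.2 + 7.2 + 3.75, i.e. approximately 14.95; the second set yields no grade because SP1 is below 7.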

Examinations or Special Assignments

Each student will be given a set of problems, from which he/she will select one. The student will be graded according to how he/she achieves the corresponding solution. More specifically, in this course we will be evaluating:

1) how the student researches and compares solutions already proposed for the problem at hand;

2) how the student proposes and implements a (possibly original) solution to the problem;

3) how the student evaluates the solution he/she proposes;

4) how the student proposes improvements to the initial solution, and also how he/she implements and evaluates such improvements;

5) how the student communicates to others the solution he/she developed.

The evaluation will include two components:

1) SP: "short-paper" - 30% of the final grade
2) FP: "full-paper" - 70% of the final grade

The SP component will be evaluated halfway through the semester and will consist of:

SP1: the student will have to prepare a “short-paper” describing the first experiments in tackling the selected problem.

SP2: short presentation (10 minutes) on the work done so far.

The FP component will be evaluated at the end of the semester and consists of:

FP1: a "full-paper" (written in English) containing a description of the final solution to the problem and the results of the evaluation experiments on the proposed solution.

FP2: public presentation (25 minutes) and demonstration of the project.

Special assessment (TE, DA, ...)

Students under special evaluation constraints are allowed to skip lectures. However, they still have to make the public presentations described in the previous section and the final grades will be given according to evaluation criteria already described.

Classification improvement

Only the end-of-semester evaluation (70%) can be subject to grade improvement. The student will have to submit a new piece of research work (i.e., a full-paper) and make the corresponding public presentation.


Copyright 1996-2025 © Faculdade de Engenharia da Universidade do Porto
Page generated on: 2025-06-15 at 17:24:01