
Next Generation Sequencing

Code: BIOL4029     Acronym: BIOL4029

Keywords
Classification Keyword
Official Biology

Instance: 2021/2022 - 1S

Active? Yes
Responsible unit: Department of Biology
Course/CS Responsible: Master in Bioinformatics and Computational Biology

Cycles of Study/Courses

Acronym | No. of Students | Study Plan | Curricular Years | Credits UCN | Credits ECTS | Contact hours | Total Time
E:BBC | 1 | PE_Bioinformatics and Computational Biology | 1 | - | 6 | 42 | 162
M:BBC | 13 | The study plan since 2018 | 1 | - | 6 | 42 | 162

Teaching language

Suitable for English-speaking students

Objectives

The aim of this course is to equip graduate students with advanced knowledge of NGS data analysis using bioinformatics and computational biology, from both a theoretical and a practical point of view. The course is designed for students who want to build careers applying new sequencing technologies to biological problems involving proteins, genes, genomes and their interactions.

Learning outcomes and competences

Acquisition of knowledge about:
Working with the Unix command line and writing small scripts that modify files without opening them.
Techniques for cleaning third-generation sequencing reads.
Read assembly techniques, both by mapping and de novo.
Analysis of RNA-Seq and transcriptomic data.
Prediction and localization of genes within contigs, and the software used.
Functional annotation of proteins.

Working method

In person

Pre-requirements (prior knowledge) and co-requirements (common knowledge)

none

Program

I. Introduction to NGS. Library preparation and sequencing technologies.
      Technology and methods behind each NGS platform

II. Introduction to the command line. Main commands for moving between folders and for creating, deleting, sorting and moving files on a Unix system.
      How to connect to a remote server to run larger analyses. Secure file transfer between computers.

      Exercises with shell commands. Introduction to the stream editor SED: options and functionality.

      Applied exercises with SED. Modifying files without opening them.

      Introduction to the text-processing language AWK. Examples, built-in functions and conditions to apply to text files. Exercises.
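
As a rough illustration of what the SED and AWK exercises aim at, the Python sketch below mimics two typical one-liners on a stream of lines: a `sed 's/old/new/'` substitution and an AWK-style filter on a numeric column. The function names, the sample records and the column layout are assumptions made for this example and are not taken from the course material.

```python
import re
from typing import Iterable, Iterator


def substitute(lines: Iterable[str], pattern: str, replacement: str) -> Iterator[str]:
    """Replace the first match of `pattern` on each line, like sed 's/pattern/replacement/'."""
    regex = re.compile(pattern)
    for line in lines:
        yield regex.sub(replacement, line, count=1)


def select_rows(lines: Iterable[str], column: int, minimum: float, sep: str = "\t") -> Iterator[str]:
    """Keep rows whose numeric `column` (1-based, as in AWK) is at least `minimum`."""
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if float(fields[column - 1]) >= minimum:
            yield line


# Hypothetical tab-separated records: read name, chromosome, mapping quality.
records = [
    "read1\tchr1\t60",
    "read2\tchr2\t12",
    "read3\tchr1\t37",
]

renamed = list(substitute(records, r"^read", "sample_"))
confident = list(select_rows(records, column=3, minimum=30))

print(renamed)
print(confident)
```

Both functions are generators, so they process one line at a time and never load the whole file, which is the property that makes stream tools practical on large sequencing files.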

III. What read data is, how to visualise it and how to analyse quality scores. General overview of how sequencing error affects:

  1. read mapping,
  2. variant detection,
  3. haplotype reconstruction,
  4. phylogenies,
  5. de novo assembly,
  6. differential expression.
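
To make the link between quality scores and sequencing error concrete: a Phred score Q encodes an error probability p = 10^(-Q/10), and FASTQ files commonly store Q as the ASCII character with code Q + 33. The sketch below decodes a quality string under that Phred+33 assumption and sums the per-base error probabilities. The function names and the example quality string are made up for illustration.

```python
from typing import List


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode an ASCII quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality]


def error_probability(score: int) -> float:
    """Probability that a base call with this Phred score is wrong."""
    return 10 ** (-score / 10)


def expected_errors(quality: str, offset: int = 33) -> float:
    """Expected number of miscalled bases in a read, summed over its positions."""
    return sum(error_probability(q) for q in phred_scores(quality, offset))


quality = "II5+!"  # made-up quality string for a 5-base read
scores = phred_scores(quality)

print(scores)
print(expected_errors(quality))
```

A single low-quality base dominates the expected error count, which is one reason downstream steps such as variant detection are sensitive to a few poor positions.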


Case study: mapping algorithms in detail.
Practical: variant detection and the evolution of drug resistance in HIV-1 using mapped reads.            
      Variant detection following removal of sequencing error using a Bayesian framework.
       Practical: applying the Bayesian framework. 

IV.  Novel ways to store data following error removal. 

      Novel ways of looking at genetic variation that do not require mapping of reads, e.g. Markov models.
      Transcriptome assembly vs genome assembly.
 

V. Three methods of transcriptome assembly in detail (and how errors affect each of them):


  1. Reference based to genome. 

  2. De novo assembly of mRNA. 

  3. A hybrid approach. 


Practical: performing an assembly using the hybrid approach.
 

VI. The problem of chimeras following assembly and how they arise (network analysis).
        Practical - removing chimeras from the data.
        Differential expression experiments based on quality-filtered read counts.
 

VII. Websites and forums for discussing scripting problems.

        FASTQ format and the meaning of its contents.

        Post-sequencing pre-processing.
        Quality filters applied to raw data and adaptor removal
         Exercises
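
The pre-processing steps listed above can be sketched in a few lines of Python. This is a minimal sketch, assuming four-line FASTQ records, Phred+33 quality encoding, exact-match adaptor trimming at the 3' end and a mean-quality cutoff of 20. The adaptor string, the cutoff, the sample reads and all function names are assumptions chosen for the example; dedicated trimming tools handle mismatches and partial adaptors, which this sketch does not.

```python
import io
from typing import Iterator, List, TextIO, Tuple

Record = Tuple[str, str, str]  # (read name, sequence, quality string)

ADAPTOR = "AGATCGGAAG"  # assumed adaptor sequence, for illustration only


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield records from a FASTQ stream with exactly four lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored here
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality  # drop the leading '@'


def trim_adaptor(record: Record, adaptor: str) -> Record:
    """Cut the read at the first exact occurrence of the adaptor, if any."""
    name, sequence, quality = record
    position = sequence.find(adaptor)
    if position == -1:
        return record
    return name, sequence[:position], quality[:position]


def mean_quality(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    if not quality:
        return 0.0
    return sum(ord(char) - offset for char in quality) / len(quality)


def preprocess(handle: TextIO, adaptor: str, min_mean_quality: float) -> List[Record]:
    """Trim adaptors, then keep reads whose mean quality reaches the cutoff."""
    trimmed = (trim_adaptor(record, adaptor) for record in read_fastq(handle))
    return [r for r in trimmed if r[1] and mean_quality(r[2]) >= min_mean_quality]


# Two made-up reads: one high quality with an adaptor, one uniformly low quality.
fastq_text = "\n".join([
    "@read1", "ACGTACGT" + ADAPTOR, "+", "I" * 18,
    "@read2", "ACGTACGTACGTACGTAC", "+", "#" * 18,
]) + "\n"

kept = preprocess(io.StringIO(fastq_text), ADAPTOR, min_mean_quality=20)
print(kept)
```

Trimming before filtering is a design choice in this sketch: the quality of adaptor bases says nothing about the biological insert, so they are removed before the mean is computed.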
 

IX. Gene prediction in prokaryotes and eukaryotes. Basis of the prediction.
        Prediction based on homology and ab initio methods (based on signals).
        Available software and exercises
 

X. Gene functional annotation.
       Term databases, methodology and strategies.
         Available software and exercises

Mandatory literature

Teresa K. Attwood; Introduction to bioinformatics. ISBN: 0-582-32788-1

Complementary Bibliography

Tore Samuelsson; Genomics and Bioinformatics, Cambridge University Press, 2012. ISBN: 9781139022095

Teaching methods and learning activities

Classroom lessons
Activities directed and supervised by tutors.
Tasks/exercises for applying acquired knowledge

Keywords

Natural sciences > Biological sciences > Biology > Genetics
Natural sciences > Biological sciences > Biology > Computational biology
Natural sciences > Biological sciences > Biology > Molecular biology

Evaluation Type

Distributed evaluation without final exam

Assessment Components

Designation | Weight (%)
Practical or project work | 90.00
In-person participation | 10.00
Total: | 100.00

Amount of time allocated to each course unit

Designation | Time (hours)
Class attendance | 22.00
Research work | 20.00
Total: | 42.00

Eligibility for exams

Hours of student attendance divided by the total number of course hours (attendance ≥ 80% is required for eligibility).

Calculation formula of final grade

Sum of the grades obtained in the individual assignments submitted by the student, divided by the total number of assignments distributed.
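
Read literally, this formula divides by the number of assignments distributed rather than the number submitted, so a missing assignment would count as zero. The sketch below follows that reading; the 0 to 20 grading scale, the example grades and the function name are assumptions, not part of the official rules.

```python
from typing import Optional, Sequence


def final_grade(grades: Sequence[Optional[float]]) -> float:
    """Average over all distributed assignments; None marks an assignment not submitted."""
    if not grades:
        raise ValueError("at least one assignment must have been distributed")
    submitted_total = sum(grade for grade in grades if grade is not None)
    return submitted_total / len(grades)


# Four assignments distributed, three submitted (hypothetical grades on a 0-20 scale).
example = final_grade([16.0, 14.0, None, 18.0])
print(example)
```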

Examinations or Special Assignments

An assignment for each chapter of the course program

Internship work/project

no

Special assessment (TE, DA, ...)

An assignment for each chapter of the course program

Classification improvement

Modulation by class attendance

Observations

N/A
Copyright 1996-2025 © Faculdade de Ciências da Universidade do Porto
Page created on: 2025-06-14 at 09:43:48