Next Generation Sequencing
Keywords
Classification | Keyword
Official | Biology
Instance: 2018/2019 - 1S
Cycles of Study/Courses
Teaching language
Suitable for English-speaking students
Objectives
The aim of this course is to equip graduate students with advanced knowledge of NGS data analysis through bioinformatics and computational biology, from both a theoretical and a practical point of view. The course is designed for students interested in building careers around new sequencing technologies to solve biological problems related to proteins, genes, genomes and their interactions.
Learning outcomes and competences
Acquisition of knowledge about:
Working with the Unix command line and writing small scripts that modify files without opening them.
Techniques for cleaning third-generation sequencing reads.
Read assembly techniques, both mapping-based and de novo.
Analysis of RNA-Seq and transcriptomic data.
Prediction and localization of genes within contigs, and the software used.
Functional annotation of proteins.
Working method
In person
Pre-requirements (prior knowledge) and co-requirements (common knowledge)
none
Program
I. Introduction to NGS. Library preparation and sequencing technologies.
Technology and methods behind each NGS platform
II. Introduction to command line tricks. Learning the main commands to move between folders and to create, delete, sort and move files on a Unix system.
How to connect to a remote server to run larger analyses. Secure file transfer between computers.
Exercises with shell commands. Introduction to the stream editor sed: options and functionality.
Applied exercises with sed. Modifying files without opening them.
Introduction to the text-processing language AWK. Examples, built-in functions and conditions to apply to text files. Exercises.
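As a rough illustration of what stream editing means, the sketch below mimics in Python what a sed substitution and an AWK field condition do: the input is read line by line, transformed and passed on, without ever being opened in an editor. The function names, the contig table and its tab-separated layout are assumptions made for this example, not course material.

```python
import re
from typing import Iterable, Iterator


def substitute(lines: Iterable[str], pattern: str, replacement: str) -> Iterator[str]:
    """Rough analogue of sed 's/pattern/replacement/g': rewrite every line."""
    regex = re.compile(pattern)
    for line in lines:
        yield regex.sub(replacement, line)


def select_rows(lines: Iterable[str], column: int, minimum: float) -> Iterator[str]:
    """Rough analogue of an AWK condition on one field: keep matching rows.

    `column` is 1-based, as in AWK; fields are assumed to be tab-separated.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if float(fields[column - 1]) >= minimum:
            yield line


# Hypothetical table: contig name, contig length in base pairs.
table = ["contig_1\t1500\n", "contig_2\t300\n", "contig_3\t980\n"]

# Shorten the names, then keep only contigs of at least 900 bp.
renamed = list(substitute(table, r"^contig_", "ctg"))
long_only = list(select_rows(renamed, column=2, minimum=900))
```

Because both functions are generators, they can be chained over an open file handle and will process it one line at a time, which is what makes the stream approach practical for large sequencing files.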
III. What read data is, how to visualise it and how to analyse quality scores. General overview of how sequencing error affects:
- read mapping,
- variant detection,
- haplotype reconstruction,
- phylogenies,
- de novo assembly,
- differential expression.
Case study: mapping algorithms in detail.
Practical: variant detection and the evolution of drug resistance in HIV-1 using mapped reads.
Variant detection following removal of sequencing error using a Bayesian framework.
Practical: applying the Bayesian framework.
IV. Novel ways to store data following error removal.
Novel ways of looking at genetic variation that don’t require mapping of reads, e.g. Markov models.
Transcriptome assembly vs genome assembly.
V. Three methods of transcriptome assembly in detail (and how errors affect each of them):
- Reference based to genome.
- De novo assembly of mRNA.
- A hybrid approach.
Practical: performing an assembly using the hybrid approach.
VI. The problem of chimeras following assembly and how they arise (network analysis).
Practical: removing chimeras from the data.
Differential expression experiments based on quality-filtered read counts.
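To give a feel for what comparing read counts involves, the sketch below normalises raw counts to counts per million (CPM) and computes a log2 fold change with a pseudocount. The gene names and counts are invented, both samples are assumed to list the same genes, and the example omits replicates and statistical testing, which a real differential expression analysis requires.

```python
import math
from typing import Dict


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


def log2_fold_change(
    control: Dict[str, int], treated: Dict[str, int], pseudocount: float = 1.0
) -> Dict[str, float]:
    """log2(treated / control) on CPM values; the pseudocount avoids log(0)."""
    cpm_control = counts_per_million(control)
    cpm_treated = counts_per_million(treated)
    return {
        gene: math.log2(
            (cpm_treated[gene] + pseudocount) / (cpm_control[gene] + pseudocount)
        )
        for gene in control
    }


# Invented counts: the treated library is twice as deep as the control.
control = {"geneA": 100, "geneB": 300, "geneC": 600}
treated = {"geneA": 400, "geneB": 600, "geneC": 1000}

lfc = log2_fold_change(control, treated)
```

Here geneB doubles in raw counts but has a fold change of zero once library size is accounted for, which is why normalisation comes before any comparison.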
VII. Websites and forums to discuss problems with scripting.
FASTQ format and the meaning of its content.
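A FASTQ record spans four lines: an identifier starting with `@`, the sequence, a separator starting with `+`, and one quality character per base. The sketch below parses such records and decodes the qualities, assuming the common Phred+33 encoding. The record shown and the function names are made up for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

PHRED_OFFSET = 33  # Assumption: Phred+33 (Sanger / Illumina 1.8+) encoding.


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (identifier, sequence, quality string) for each 4-line record.

    Blank lines are skipped; the whole input is held in memory, which is
    acceptable for a sketch but not for full sequencing runs.
    """
    stripped = [line.rstrip("\n") for line in lines if line.strip()]
    if len(stripped) % 4 != 0:
        raise ValueError("FASTQ input is not a multiple of four lines")
    for i in range(0, len(stripped), 4):
        header, sequence, separator, quality = stripped[i:i + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], sequence, quality


def phred_scores(quality: str) -> List[int]:
    """Convert quality characters to integer Phred scores."""
    return [ord(char) - PHRED_OFFSET for char in quality]


def error_probability(score: int) -> float:
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-score / 10)


# Invented record with four bases of decreasing quality.
record = ["@read_1\n", "ACGT\n", "+\n", "II5!\n"]
name, sequence, quality = next(parse_fastq(record))
scores = phred_scores(quality)
```

A score of 20 corresponds to a 1 in 100 chance that the base call is wrong, and a score of 40 to 1 in 10,000.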
Post-sequencing pre-processing.
Quality filters applied to raw data and adaptor removal
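These two pre-processing steps can be sketched as follows. This is a deliberately naive illustration: the adaptor is matched exactly (no mismatches or partial overlaps at the 3' end), reads are filtered on mean quality, and Phred+33 encoding is assumed. The adaptor sequence, thresholds and function names are example values chosen here; dedicated trimming tools handle these cases far more carefully.

```python
from typing import Optional, Tuple

PHRED_OFFSET = 33  # Assumption: Phred+33 quality encoding.


def trim_adaptor(sequence: str, quality: str, adaptor: str) -> Tuple[str, str]:
    """Cut the read at the first exact occurrence of the adaptor, if any."""
    position = sequence.find(adaptor)
    if position == -1:
        return sequence, quality
    return sequence[:position], quality[:position]


def mean_quality(quality: str) -> float:
    """Mean Phred score of a quality string (0.0 for an empty read)."""
    if not quality:
        return 0.0
    return sum(ord(char) - PHRED_OFFSET for char in quality) / len(quality)


def clean_read(
    sequence: str,
    quality: str,
    adaptor: str,
    min_mean_quality: float = 20.0,
    min_length: int = 4,
) -> Optional[Tuple[str, str]]:
    """Trim the adaptor, then drop reads that are too short or too noisy."""
    sequence, quality = trim_adaptor(sequence, quality, adaptor)
    if len(sequence) < min_length:
        return None
    if mean_quality(quality) < min_mean_quality:
        return None
    return sequence, quality


ADAPTOR = "AGATCGGAAGAGC"  # Example adaptor sequence used for this sketch.

# A high-quality read with adaptor read-through, and a low-quality read.
kept = clean_read("ACGTACGT" + ADAPTOR, "I" * 21, ADAPTOR)
dropped = clean_read("ACGTACGT", "#" * 8, ADAPTOR)
```

Trimming before filtering matters: a read can pass the length check only because of adaptor bases that carry no biological information.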
Exercises
IX. Gene prediction in prokaryotes and eukaryotes. Basis of the prediction.
Prediction based on homology and ab initio methods (based on signals).
Available software and exercises
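As a toy illustration of signal-based (ab initio) prediction, the sketch below scans the forward strand of a sequence for open reading frames: a start codon followed in frame by a stop codon. It assumes the standard start codon (ATG) and stop codons (TAA, TAG, TGA) and ignores the reverse strand, introns and any statistical model, so it is closer to simple prokaryotic ORF finding than to real gene prediction software. The contig is invented.

```python
from typing import List, Tuple

START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) coordinates of forward-strand ORFs.

    Coordinates are 0-based and half-open, and include the stop codon.
    `min_codons` counts codons from the start codon up to, but excluding,
    the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == START:
                start = i  # First start codon opens the ORF in this frame.
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # Look for the next ORF in the same frame.
    return sorted(orfs)


# Invented contig with one short ORF in the third reading frame.
contig = "CCATGAAATTTTAGGG"
orfs = find_orfs(contig)
```

The TGA at position 3 is not reported because no start codon precedes it in its frame, which shows why the reading frame has to be tracked explicitly.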
X. Gene functional annotation.
Term databases, methodology and strategies.
Available software and exercises
Mandatory literature
Teresa K. Attwood; Introduction to bioinformatics. ISBN: 0-582-32788-1
Complementary Bibliography
Tore Samuelsson; Genomics and Bioinformatics, Cambridge University Press, 2012. ISBN: 9781139022095
Teaching methods and learning activities
Classroom lessons
Activities directed and supervised by tutors.
Tasks/Exercises for applying acquired knowledge
Keywords
Natural sciences > Biological sciences > Biology > Genetics
Natural sciences > Biological sciences > Biology > Computational biology
Natural sciences > Biological sciences > Biology > Molecular biology
Evaluation Type
Distributed evaluation without final exam
Assessment Components
Designation | Weight (%)
Practical or project work | 90.00
In-person participation | 10.00
Total | 100.00
Amount of time allocated to each course unit
Designation | Time (hours)
Class attendance | 22.00
Research work | 20.00
Total | 42.00
Eligibility for exams
Hours attended by the student divided by the total number of course hours. Attendance of at least 80% is required for eligibility (frequência).
Calculation formula of final grade
Sum of the grades obtained in the individual assignments submitted by the student, divided by the total number of assignments distributed.
Examinations or Special Assignments
One assignment for each chapter of the course program
Internship work/project
no
Special assessment (TE, DA, ...)
One assignment for each chapter of the course program
Classification improvement
Modulation by class attendance
Observations
N/A