Go to:
You are in:: Start > CC4047
Site map
FC6 - Departamento de Ciência de Computadores FC5 - Edifício Central FC4 - Departamento de Biologia FC3 - Departamento de Física e Astronomia e Departamento GAOT FC2 - Departamento de Química e Bioquímica FC1 - Departamento de Matemática

Algorithms for Bioinformatics

Code: CC4047     Acronym: CC4047

Classification Keyword
OFICIAL Computer Science

Instance: 2020/2021 - 2S Ícone do Moodle Ícone  do Teams

Active? Yes
Responsible unit: Department of Computer Science
Course/CS Responsible: Master's degree in Bioinformatics and Computational Biology

Cycles of Study/Courses

Acronym No. of Students Study Plan Curricular Years Credits UCN Credits ECTS Contact hours Total Time
E:BBC 3 PE_Bioinformatics and Computational Biology 1 - 6 42 162
M:BBC 19 The study plan since 2018 1 - 6 42 162

Teaching Staff - Responsibilities

Teacher Responsibility
Pedro Gabriel Dias Ferreira

Teaching - Hours

Theoretical and practical : 3,00
Type Teacher Classes Hour
Theoretical and practical Totals 1 3,00
Pedro Gabriel Dias Ferreira 3,00
Mais informaçõesLast updated on 2021-03-08.

Fields changed: Calculation formula of final grade

Teaching language

Suitable for English-speaking students


Bioinformatics is an interdisciplinary field that combines the fields of computer science, biology and biomedical science and statistics. Bioinformatics is devoted to the application and development of new computational methods for expanding the use of biological, biomedical or epidemiological data. Recent developments in high-throughput technologies have led to a real revolution in the biological and biomedical research with bioinformatics playing a central role in the analysis of massive amounts of data. This course will focus on the main algorithms developed to address Bioinformatics tasks. An emphasis will be made on algorithms for sequence processing and analysis, both on nucleic-acids or amino-acid sequences.

Our goal is that students will be able to understand how these algorithms work and how this can be developed and applied to address new computational tasks in biological sequence analysis.

Learning outcomes and competences

By the end of this course it is expected that the student

  1. Is familiarized with the main concepts of Bioinformatics including the main concepts on Computational Molecular Biology;
  2. Identifies the main sources of biological sequence data (e.g. nucleotide or amino-acid sequences; motifs and domains) and associated types and how can they be represented from a computational point of view.
  3. Understands different problems related to sequence analysis and which are the most adequate algorithms and data structures to solve these problems.
  4. Identify the advantages and disadvantages of each method from an algorithmic point of view. Emphasis will be given to methods on basic sequence processing tasks, transcription and translation, pairwise and multiple sequence alignment, sequence pattern finding, phylogenetic analysis from sequences and graphs and biological networks.
  5. Have a perspective of Bioinformatics as a field of critical importance to leverage biological, biomedical and health research and as a field of constant and fast-paced development.

Working method

À distância

Pre-requirements (prior knowledge) and co-requirements (common knowledge)



  1. Overview of Molecular Biology Fundamental Concepts;

  2. Introduction to Python programming language;

  3. Basic Processing of Biological Sequence;

  4. Finding Patterns in Sequences;

  5. Pairwise Sequence Alignment;

  6. Searching Similar Sequences in Databases;

  7. Multiple Sequence Alignment;

  8. Clustering and Trees;

  9. Probabilistic Motifs;

  10. Graphs and Biological Networks;

Mandatory literature

Miguel Rocha and Pedro G. Ferreira; Bioinformatics Algorithms(1st Edition): Design and Implementation in Python., Elsevier, 2018. ISBN: 9780128125205 (https://www.elsevier.com/books/bioinformatics-algorithms/rocha/978-0-12-812520-5)

Complementary Bibliography

Neil C. Jones and Pavel A. Pevzner ; An Introduction to Bioinformatics Algorithms (Computational Molecular Biology) 1st Edition. . ISBN: 0262101068 (https://www.amazon.com/Introduction-Bioinformatics-Algorithms-Computational-Molecular/dp/0262101068)
Sebastian Bassi; Python for Bioinformatics, CRC Press, 2016
R. Durbin, S. Eddy, A. Krogh, G. Mitchison; iological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998
B. Alberts, A. Johnson, J. Lewis, M. Raff, K. Roberts, P. Walter; Molecular Biology of the Cell, 4th edition, Garland Science, 2002
The official home of the Python Programming Language; The python language website (http://www.python.org/)
Philip Guo; Python tutor, visualization of code execution (http://pythontutor.com/)
Dan Gusfield; Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge University Press, 1997
Python Software Foundation; The python tutorial (https://docs.python.org/3/tutorial/)
Stephen F. Altschul, Warren Gish, Webb Miller, Eugene W. Myers, David J. Lipman; Basic local alignment search tool, Journal of Molecular Biology 215 (3) (1990) 403–410.
Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, David J. Lipman; Gapped blast and psi-blast: a new generation of protein database search programs, Nucleic Acids Research 25 (17) (1997) 3389–3402.
T.L. Bailey, C. Elkan; itting a mixture model by expectation maximization to discover motifs in biopolymers, Proceedings. International Conference on Intelligent Systems for Molecular Biology 2 (1994) 28–36.
Robert S. Boyer, J. Strother Moore; A fast string searching algorithm, Communications of the ACM 20 (10) (October 1977) 762–772.
Humberto Carrillo, David Lipman; The multiple sequence alignment problem in biology, SIAM Journal on Applied Mathematics 48 (5) (1988) 1073–1082.
M.K. Das, H.K. Dai; A survey of DNA motif finding algorithms, BMC Bioinformatics 8 (Suppl 7) (Nov 2007) S21.
Desmond G. Higgins, Paul M. Sharp; Clustal: a package for performing multiple sequence alignment on a microcomputer, Gene 73 (1) (1988) 237–244.
P. D’haeseleer; What are DNA sequence motifs? Nature Biotechnology 24 (4) (Apr 2006) 423–425.

Teaching methods and learning activities

  • Theoretical classes: expository, accompanied by examples

  • Practical classes: implementation (in Python) of algorithms and practical assignments support.



Evaluation Type

Distributed evaluation with final exam

Assessment Components

designation Weight (%)
Exame 50,00
Participação presencial 5,00
Trabalho prático ou de projeto 45,00
Total: 100,00

Amount of time allocated to each course unit

designation Time (hours)
Estudo autónomo 120,00
Frequência das aulas 42,00
Total: 162,00

Eligibility for exams

A minimum grade of 8 is required in the exam and test and in the pratical assignments.

Calculation formula of final grade

Grading criteria:
The UC grade is determined based on satisfactory progress (minimum grade of 8) in weekly assignments and final exam.

The breakdown of these components is as follows:

P: Assignments: 40% (practical and theoretical assignments). To perform individually or in group. To be submitted during the week of the class.
E: Final exam: 60%

Final Grade = P x 0.40 + 0.6 x E

Component E can be improved. Componente P cannot be improved.


The exam will be realized in presential mode.
Recommend this page Top
Copyright 1996-2022 © Faculdade de Ciências da Universidade do Porto  I Terms and Conditions  I Acessibility  I Index A-Z  I Guest Book
Page created on: 2022-09-25 at 06:22:39 | Reports Portal