Computational Analysis of Molecular Data
Keywords
| Classification | Keyword |
| Official | Biology |
Instance: 2016/2017 - 2S
Cycles of Study/Courses
Teaching language
English
Objectives
The aim of this curricular unit is to provide training in the conceptual and practical aspects of analyzing population genomics datasets, with emphasis on applications. This includes an introduction to the main concepts of coalescent, Bayesian, approximate Bayesian computation (ABC), and likelihood-based approaches. Emphasis will be placed on interpreting the output of statistical approaches and software programs. The main computational methods for analyzing molecular data from high-throughput sequencing and for estimating demography and selection will also be taught.
Learning outcomes and competences
This curricular unit addresses current methodologies for the computational analysis of sequence data. Advances in parallel, high-throughput sequencing technologies have boosted the generation of sequence data, prompting new ways of dealing with large volumes of information. The unit focuses on current statistical methods that are best suited to particular questions but, more importantly, on understanding data types and formats, models, and underlying algorithms. Because computational methods for analyzing molecular data are developing rapidly, and because the curricular unit emphasizes fundamental concepts rather than particular software, students will be able to keep up with new developments in bioinformatics tools.
Working method
In person
Program
- Overview and introduction to statistical approaches in population genetics
- Frequentist, likelihood, and Bayesian approaches
- The theory of coalescence
- Estimation of effective population size (Ne) and Approximate Bayesian Methods
- Sequence data analysis and quality scores
- Combining genetics and demography to assess dispersal (and detect selection)
- Detecting selection: FST-outliers and local adaptation
- Landscape genetics and spatial statistics
- Analysis using high-throughput sequencing
- RAD sequencing, short-read sequence analysis and SNP detection
- Data mining. File formats and conversion. Fetching data through APIs.
- Bootstrap, Jackknife and permutation methods
- Scripting languages (Python, Perl and R)
Mandatory literature
Gascuel O; Mathematics of Evolution and Phylogeny, Oxford University Press, 2007. ISBN: 978-0199231348
Manly BFJ; Randomization, Bootstrap and Monte Carlo Methods in Biology, Chapman & Hall, 2006. ISBN: 978-1584885412
Teaching methods and learning activities
Theoretical classes, laboratory work (computational analysis).
Evaluation Type
Distributed evaluation with final exam
Assessment Components
| Designation | Weight (%) |
| Exam | 50.00 |
| Attendance and participation | 20.00 |
| Written assignment | 30.00 |
| Total: | 100.00 |
Eligibility for exams
Completion of the laboratory work and the corresponding report.
Attendance at a minimum of 50% of the theoretical classes.
Calculation formula of final grade
Report on the laboratory practical work (30/100); written test on topics covered in the theoretical component of the course (50/100); attendance and participation in class (20/100).
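The weighting above can be sketched as a small calculation. This is only an illustration: it assumes all three components are marked on the same scale (the scale is not stated here), and the function name `final_grade`, its parameter names, and the example marks are hypothetical.

```python
# Weights taken from the assessment components (fractions of the final grade).
WEIGHTS = {"exam": 0.50, "report": 0.30, "participation": 0.20}


def final_grade(exam, report, participation):
    """Weighted sum of the three components.

    Assumption: every component is marked on the same scale.
    """
    return (
        WEIGHTS["exam"] * exam
        + WEIGHTS["report"] * report
        + WEIGHTS["participation"] * participation
    )


# Made-up marks, for illustration only.
print(final_grade(exam=14, report=16, participation=18))
```

Because the weights sum to 1, equal marks in all three components give back that same mark as the final grade.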