Big Data and Cloud Computing
Keywords
Classification | Keyword
Official | Computer Science
Instance: 2022/2023 - 2S 
Cycles of Study/Courses
Teaching language
English
Objectives
Introduction to the use of cloud computing infrastructures for processing massive amounts of data ("big data") in real-world problems.
Learning outcomes and competences
- Use of cloud computing services for big data applications.
- Programming big data applications using cloud programming models.
- Understanding of core fundamentals and algorithms for mining big data.
- Hands-on practice with state-of-the-art tools for cloud computing and big data.
Working method
In person
Program
- Introduction to big data processing: challenges, example problems from science and business.
- The cloud computing paradigm: service models (PaaS, SaaS, IaaS); service virtualization, deployment and orchestration; integration of computing, networking and storage resources; scalability, fault-tolerance, and “elasticity”.
- Cloud storage solutions for big data: cloud file systems, NoSQL and graph-based databases, “object stores”.
- High-performance big data applications using cloud programming models: MapReduce, stream-based programming.
- Programming assignments on big data applications on specific topics such as data streams, social-network graphs, recommendation systems, or bioinformatics.
Mandatory literature
Ian Foster and Dennis B. Gannon; Cloud Computing for Science and Engineering, MIT Press, 2017. ISBN: 978-0262037242
Complementary Bibliography
Tom White; Hadoop: The Definitive Guide, 4th edition, O'Reilly Media, 2015. ISBN: 978-1491901632
N. Marz and J. Warren; Big Data: Principles and best practices of scalable realtime data systems, Manning Publications, 2015. ISBN: 978-1617290343
Dan C. Marinescu; Cloud Computing - Theory and Practice, 2nd edition, Morgan Kaufmann, 2018. ISBN: 978-0-12-812810-7
Jure Leskovec, Anand Rajaraman, Jeff Ullman; Mining of Massive Datasets, Cambridge University Press, 2014. ISBN: 978-1107077232 (Available free in PDF format from the authors at http://mmds.org)
M. Zaharia and B. Chambers; Spark: The Definitive Guide - Big Data Processing Made Simple, O'Reilly, 2018. ISBN: 978-1491912218
Teaching methods and learning activities
- Introduction of cloud computing technologies in tandem with big data application requirements.
- Hands-on practice in programming projects using tools from major cloud service providers (Amazon Web Services, Microsoft Azure, Google Cloud, etc.) and DCC computer clusters for MapReduce.
Evaluation Type
Distributed evaluation with final exam
Assessment Components
Designation | Weight (%)
Practical or project work | 40.00
Test | 60.00
Total: | 100.00
Amount of time allocated to each course unit
Designation | Time (hours)
Project development | 52.00
Class attendance | 52.00
Independent study | 58.00
Total: | 162.00
Eligibility for exams
--
Calculation formula of final grade
The grade is determined by two evaluation components:
- 2 tests (T1 and T2) with a weight of 60% (12 = 6+6 "valores")
- project assignments (TP) with a weight of 40% (8 "valores")
To pass, students must have:
- a minimum grade of 8 "valores" in T1 and T2: T1 >= 8 and T2 >= 8
- a final grade of 0.3 T1 + 0.3 T2 + 0.4 TP >= 9.5 "valores"
If students are unable to pass under the conditions stated above, they can take a final exam where:
- The exam has 2 components corresponding to the material for tests T1 (E1) and T2 (E2).
- Students may opt to take one or both components. For each repeated component, the better of the two grades counts towards the final grade (E1 or T1, E2 or T2), subject to a minimum grade of 8 "valores".
- The final grade is given by N = 0.3 max(T1,E1) + 0.3 max(T2,E2) + 0.4 TP. Students pass if N >= 9.5, max(T1,E1) >= 8 and max(T2,E2) >= 8.
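The rules above can be sketched as a small Python function. This is only an illustrative reading of the formula, not an official calculator: the function name `final_grade` and the convention of passing `None` for an exam component that was not taken are assumptions made for this example. All grades are taken to be "valores" on the same scale as the 9.5 passing threshold.

```python
from typing import Optional, Tuple


def final_grade(t1: float, t2: float, tp: float,
                e1: Optional[float] = None,
                e2: Optional[float] = None) -> Tuple[float, bool]:
    """Return (N, passed) for test grades T1, T2, project grade TP,
    and optional exam components E1, E2 (None means "not taken")."""
    # The better grade of each repeated component counts.
    best1 = t1 if e1 is None else max(t1, e1)
    best2 = t2 if e2 is None else max(t2, e2)

    # N = 0.3 max(T1,E1) + 0.3 max(T2,E2) + 0.4 TP, computed as
    # (3*best1 + 3*best2 + 4*tp) / 10 so that the comparison with the
    # 9.5 threshold is not affected by rounding of 0.3 and 0.4.
    weighted = 3 * best1 + 3 * best2 + 4 * tp
    n = weighted / 10

    # Passing requires N >= 9.5 and at least 8 in each test component.
    passed = weighted >= 95 and best1 >= 8 and best2 >= 8
    return n, passed


if __name__ == "__main__":
    # T1 below the minimum of 8: the student fails despite N = 11.7.
    print(final_grade(7, 12, 15))
    # Retaking component E1 in the final exam and scoring 9 fixes that.
    print(final_grade(7, 12, 15, e1=9))
```

With no exam components supplied, the function reduces to the distributed-evaluation rule (T1 >= 8, T2 >= 8 and 0.3 T1 + 0.3 T2 + 0.4 TP >= 9.5).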
Classification improvement
The grade can be improved in the final exam.
The grade of project assignments cannot be improved.