
Publications

Scalable transcriptomics analysis with Dask: applications in data science and machine learning

Title
Scalable transcriptomics analysis with Dask: applications in data science and machine learning
Type
Article in International Scientific Journal
Year
2022
Authors
Moreno, M
(Author)
Vilaca, R
(Author)
Journal
Title: BMC Bioinformatics
Vol. 23
Final page: 514
ISSN: 1471-2105
Publisher: Springer Nature
Other information
Authenticus ID: P-00X-QNH
Abstract (EN): Background: Gene expression studies are an important tool in biological and biomedical research. The signal carried in expression profiles helps derive signatures for the prediction, diagnosis and prognosis of different diseases. Data science and specifically machine learning have many applications in gene expression analysis. However, as the dimensionality of genomics datasets grows, scalable solutions become necessary. Methods: In this paper we review the main steps and bottlenecks in machine learning pipelines, as well as the main concepts behind scalable data science including those of concurrent and parallel programming. We discuss the benefits of the Dask framework and how it can be integrated with the Python scientific environment to perform data analysis in computational biology and bioinformatics. Results: This review illustrates the role of Dask for boosting data science applications in different case studies. Detailed documentation and code on these procedures is made available at https://github.com/martaccmoreno/gexp-ml-dask. Conclusion: By showing when and how Dask can be used in transcriptomics analysis, this review will serve as an entry point to help genomic data scientists develop more scalable data analysis procedures.
Language: English
Type (Professor's evaluation): Scientific
No. of pages: 20
Documents
We could not find any documents associated with the publication.
Related Publications

Of the same journal

Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers (2018)
Article in International Scientific Journal
Galpert, D; Fernandez, A; Herrera, F; Agostinho Antunes; Molina Ruiz, R; Aguero Chapin, G
SicknessMiner: a deep-learning-driven text-mining tool to abridge disease-disease associations (2021)
Article in International Scientific Journal
Rosario Ferreira, N; Guimaraes, V; Costa, VS; Moreira, IS
LOSITAN: A workbench to detect molecular adaptation based on a F(st)-outlier method (2008)
Article in International Scientific Journal
Antao, T; Lopes, A; Lopes, RJ; Beja-Pereira, A; Luikart, G
LMAP_S: Lightweight Multigene Alignment and Phylogeny eStimation (2019)
Article in International Scientific Journal
Maldonado, E; Agostinho Antunes
LMAP: Lightweight Multigene Analyses in PAML (2016)
Article in International Scientific Journal
Maldonado, E; Almeida, D; Escalona, T; Khan, I; Vitor Vasconcelos; Agostinho Antunes
