Use this identifier to cite or link to this record: https://hdl.handle.net/1822/90084

Title: Scalable transcriptomics analysis with Dask: applications in data science and machine learning
Author(s): Moreno, Marta
Vilaça, Ricardo Manuel Pereira
Ferreira, Pedro G.
Keywords: Data Science
Python
Dask
Transcriptomics analysis
Machine learning
Scalable data science
Gene expression
Transcriptomics
Data analysis
Date: 30-Nov-2022
Publisher: BMC
Journal: BMC Bioinformatics
Abstract: Background: Gene expression studies are an important tool in biological and biomedical research. The signal carried in expression profiles helps derive signatures for the prediction, diagnosis and prognosis of different diseases. Data science and specifically machine learning have many applications in gene expression analysis. However, as the dimensionality of genomics datasets grows, scalable solutions become necessary. Methods: In this paper we review the main steps and bottlenecks in machine learning pipelines, as well as the main concepts behind scalable data science including those of concurrent and parallel programming. We discuss the benefits of the Dask framework and how it can be integrated with the Python scientific environment to perform data analysis in computational biology and bioinformatics. Results: This review illustrates the role of Dask for boosting data science applications in different case studies. Detailed documentation and code on these procedures is made available at https://github.com/martaccmoreno/gexp-ml-dask. Conclusion: By showing when and how Dask can be used in transcriptomics analysis, this review will serve as an entry point to help genomic data scientists develop more scalable data analysis procedures.
Type: Article
URI: https://hdl.handle.net/1822/90084
DOI: 10.1186/s12859-022-05065-3
ISSN: 1471-2105
Publisher version: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-022-05065-3
Peer reviewed: yes
Access: Open access
Appears in collections: HASLab - Articles in international journals

Files in this record:
File: MVF22.pdf
Size: 2.54 MB
Format: Adobe PDF
