Assessment of Supervised Classification Methods for the Analysis of RNA-seq Data

authors

  • Abu El Qumsan Mustafa

keywords

  • Bioinformatics
  • Biostatistics
  • Massively parallel sequencing
  • RNA-Seq
  • Supervised classification

document type

Thesis

abstract

Over the past decade, “Next Generation Sequencing” (NGS) technologies have made it possible to characterize genomic sequences at an unprecedented pace. Many studies have focused on human genetic diversity and on the transcriptome (the part of the genome that is transcribed into ribonucleic acid). Indeed, the different tissues of our body express different genes at different times, which enables cell differentiation and functional responses to environmental changes. Since many diseases affect gene expression, transcriptome profiles can be used for medical purposes (diagnosis and prognosis). A wide variety of advanced statistical and machine learning methods have been proposed to address the general problem of classifying individuals according to multiple variables (e.g. the transcription levels of thousands of genes in hundreds of samples). During my thesis, I led a comparative assessment of machine learning methods and their parameters, in order to optimize the accuracy of sample classification based on RNA-seq transcriptome profiles.
