Augmentation of Transcriptomic Data for Improved Classification of Patients with Respiratory Diseases of Viral Origin.

Magdalena Kircher, Elisa Chludzinski, Jessica Krepel, Babak Saremi, Andreas Beineke, Klaus Jung.

Abstract

To better understand the molecular basis of respiratory diseases of viral origin, high-throughput gene-expression data are frequently generated by means of DNA microarray or RNA-seq technology. Such data can also be used to classify infected individuals by molecular signatures, that is, machine-learning models with genes as predictor variables. Early diagnosis of patients by molecular signatures could also contribute to better treatments. An approach that has rarely been considered for machine-learning models in the context of transcriptomics is data augmentation. For other data types, it has been shown that augmentation can improve classification accuracy and prevent overfitting. Here, we compare three strategies for data augmentation of DNA microarray and RNA-seq data from two selected studies on respiratory diseases of viral origin. The first study involves samples from patients whose respiratory disease is of either viral or bacterial origin; the second study involves patients with either SARS-CoV-2 or another respiratory virus as the disease origin. Specifically, we reanalyze these public datasets to study whether patient classification by transcriptomic signatures can be improved by adding artificial data when training the machine-learning models. Our comparison reveals that augmentation of transcriptomic data can improve classification accuracy and that fewer genes are necessary as explanatory variables in the final models. We also report genes from our signatures that overlap with signatures presented in the original publications of our example data. Because of the strict selection criteria, this overlap underlines the molecular role of these genes in the context of respiratory infectious diseases.
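The abstract does not name the three augmentation strategies that were compared. As a rough illustration of the general idea only, the sketch below adds noise-perturbed copies of each training sample to an expression matrix. The function name `augment_with_noise`, its parameters, and the toy data are all hypothetical and are not taken from the paper; Gaussian noise injection is simply one common augmentation technique and may differ from what the authors used.

```python
import numpy as np

def augment_with_noise(X, y, n_copies=2, noise_scale=0.1, seed=0):
    """Extend (X, y) with noisy copies of each training sample.

    Hypothetical helper, not the paper's method. For every class, each gene
    receives Gaussian noise with standard deviation noise_scale times that
    gene's standard deviation within the class.
    """
    rng = np.random.default_rng(seed)
    X_parts, y_parts = [X], [y]
    for label in np.unique(y):
        X_class = X[y == label]
        gene_sd = X_class.std(axis=0)  # per-gene spread within this class
        for _ in range(n_copies):
            noise = rng.normal(size=X_class.shape) * noise_scale * gene_sd
            X_parts.append(X_class + noise)
            y_parts.append(np.full(len(X_class), label))
    return np.vstack(X_parts), np.concatenate(y_parts)

# Toy training set (assumed): 20 samples x 50 genes, two classes,
# with the first 5 genes shifted upward in class 1.
rng = np.random.default_rng(42)
X_train = rng.normal(size=(20, 50))
y_train = np.array([0] * 10 + [1] * 10)
X_train[y_train == 1, :5] += 1.5

X_aug, y_aug = augment_with_noise(X_train, y_train, n_copies=2)
print(X_aug.shape, y_aug.shape)
```

In a setup like this, only the training portion of the data would be augmented; artificial samples in the test set would distort the accuracy estimate.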

Keywords:  SARS-CoV-2; data augmentation; deep learning; generative adversarial networks; high-dimensional data; transcriptomic data; viral acute respiratory illness

Year:  2022        PMID: 35269624      PMCID: PMC8910329          DOI: 10.3390/ijms23052481

Source DB:  PubMed          Journal:  Int J Mol Sci        ISSN: 1422-0067            Impact factor:   5.923


