Jingyang Niu1, Jing Yang1, Yuyu Guo1, Kun Qian1, Qian Wang2.
Abstract
BACKGROUND: Metabolomics is a primary omics topic, which occupies an important position in both clinical applications and basic research for metabolic signatures and biomarkers. Unfortunately, the relevant studies are challenged by the batch effect caused by many external factors. In the last decade, deep learning has become a dominant tool in data science, such that one may train a diagnosis network on a known batch and then generalize it to a new batch. However, the batch effect inevitably hinders such efforts, as the two batches under consideration can be highly mismatched.
Keywords: Batch effect; Deep learning; Diagnostic accuracy; Metabolomics
Year: 2022 PMID: 35818047 PMCID: PMC9275160 DOI: 10.1186/s12859-022-04758-z
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
MMDs of the CyTOF data before and after being calibrated by individual methods
| | Raw | ComBat | Ratio_G | fSVA | ResNet | NormAE | Ours | In Batch 1 | In Batch 2 |
|---|---|---|---|---|---|---|---|---|---|
| Patient 1 | 0.243 ± 0.010 | 0.167 ± 0.006 | 0.243 ± 0.010 | – | 0.116 ± 0.008 | 0.113 ± 0.005 | 0.067 ± 0.005 | 0.053 ± 0.005 | 0.057 ± 0.005 |
| Patient 2 | 0.230 ± 0.010 | 0.139 ± 0.005 | 0.230 ± 0.010 | 0.158 ± 0.007 | 0.120 ± 0.009 | 0.131 ± 0.007 | 0.092 ± 0.005 | 0.065 ± 0.006 | 0.064 ± 0.005 |
“Raw” is the MMD between the source and target batches before calibration. “–” indicates that the fSVA method failed because of a numerical singularity in our computation. “In Batch 1 (or 2)” is the intrinsic MMD within each batch, which is free of batch effect
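The MMD (maximum mean discrepancy) values in the table compare the distributions of two batches. As a minimal sketch, assuming a Gaussian kernel with a fixed bandwidth and the simple biased estimator (the kernel, the bandwidth `sigma`, the synthetic batches, and the `recenter` helper are all illustrative assumptions, not the settings used in the study), the quantity can be computed like this:

```python
import math
import random


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd(xs, ys, sigma=1.0):
    """Biased MMD estimate between two samples (lists of feature vectors)."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, sigma) for p in a for q in b)
        return total / (len(a) * len(b))

    squared = mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative rounding errors


def recenter(batch, reference):
    """Hypothetical naive calibration: shift `batch` to the mean of `reference`."""
    dim = len(batch[0])
    mu_b = [sum(row[j] for row in batch) / len(batch) for j in range(dim)]
    mu_r = [sum(row[j] for row in reference) / len(reference) for j in range(dim)]
    return [[row[j] - mu_b[j] + mu_r[j] for j in range(dim)] for row in batch]


# Synthetic two-batch example: batch_b is batch_a's distribution plus an offset.
rng = random.Random(0)
batch_a = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(60)]
batch_b = [[rng.gauss(0, 1) + 1.5, rng.gauss(0, 1) + 1.5] for _ in range(60)]

raw_mmd = mmd(batch_a, batch_b)
calibrated_mmd = mmd(batch_a, recenter(batch_b, batch_a))
print(f"raw MMD: {raw_mmd:.3f}, after re-centering: {calibrated_mmd:.3f}")
```

A lower value after calibration means the two batches are distributed more alike, which is how the columns of the table can be read. The "±" entries suggest that the reported values were averaged over repeated draws, which this sketch does not reproduce.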
Fig. 1 Visualization of the public CyTOF data of Patient 1. In (A) and (B), different colors highlight the two batches (Days). In (C) and (D), different colors identify true labels of the samples
Classification results on the public CyTOF data
| Metric | Source | Before calibration, target Day 1 | Before calibration, target Day 2 | After calibration, target Day 1 | After calibration, target Day 2 |
|---|---|---|---|---|---|
| ACC | Day 1 | 0.962 | 0.939 | 0.962 | 0.951 |
| | Day 2 | 0.947 | 0.961 | 0.964 | 0.961 |
| F-score | Day 1 | 0.931 | 0.885 | 0.931 | 0.909 |
| | Day 2 | 0.907 | 0.931 | 0.935 | 0.931 |
| AUC | Day 1 | 0.962 | 0.911 | 0.962 | 0.928 |
| | Day 2 | 0.958 | 0.961 | 0.968 | 0.961 |
| MCC | Day 1 | 0.906 | 0.845 | 0.906 | 0.877 |
| | Day 2 | 0.875 | 0.904 | 0.911 | 0.904 |
| ACC | Day 1 | 0.985 | 0.973 | 0.985 | 0.975 |
| | Day 2 | 0.973 | 0.982 | 0.978 | 0.982 |
| F-score | Day 1 | 0.939 | 0.901 | 0.939 | 0.905 |
| | Day 2 | 0.895 | 0.934 | 0.908 | 0.934 |
| AUC | Day 1 | 0.976 | 0.951 | 0.976 | 0.937 |
| | Day 2 | 0.963 | 0.966 | 0.947 | 0.966 |
| MCC | Day 1 | 0.931 | 0.885 | 0.931 | 0.891 |
| | Day 2 | 0.881 | 0.924 | 0.895 | 0.924 |
When the source and target indices are the same, the reported metrics are for the in-batch classification by tenfold cross-validation
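The four metrics reported in the table (ACC, F-score, AUC, MCC) can be sketched for a binary task as follows. This is an illustrative implementation using the standard textbook definitions, with made-up labels and scores; it is not the evaluation code behind the reported numbers, and the rank-based AUC in particular is an assumption about how that metric is computed.

```python
import math


def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f_score(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(y_true, y_pred):
    """Matthews correlation coefficient."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true labels, hard predictions, and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.1, 0.3, 0.6, 0.2, 0.7]

print(accuracy(y_true, y_pred), f_score(y_true, y_pred),
      mcc(y_true, y_pred), auc(y_true, scores))
```

MCC is the most conservative of the four because it uses all four cells of the confusion matrix, which is why it shows the largest gap between the in-batch and cross-batch entries.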
MMDs of the SLE data before and after being calibrated by individual methods
| Source | Target | Raw | ComBat | Ratio_G | fSVA | ResNet | NormAE | Ours | In Source |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 0.217 ± 0.010 | 0.125 ± 0.008 | 0.217 ± 0.009 | 0.285 ± 0.013 | 0.153 ± 0.010 | 0.202 ± 0.012 | 0.071 ± 0.008 | 0.053 ± 0.005 |
| | 3 | 0.696 ± 0.016 | 0.134 ± 0.009 | 0.292 ± 0.008 | 0.351 ± 0.014 | 0.144 ± 0.008 | 0.166 ± 0.008 | 0.072 ± 0.007 | |
| 2 | 1 | 0.217 ± 0.010 | 0.125 ± 0.008 | 0.221 ± 0.010 | 0.214 ± 0.011 | 0.153 ± 0.010 | 0.202 ± 0.012 | 0.073 ± 0.008 | 0.062 ± 0.005 |
| | 3 | 0.623 ± 0.016 | 0.147 ± 0.009 | 0.245 ± 0.010 | 0.242 ± 0.010 | 0.176 ± 0.010 | 0.145 ± 0.008 | 0.069 ± 0.006 | |
| 3 | 1 | 0.696 ± 0.016 | 0.134 ± 0.009 | 0.291 ± 0.008 | 0.289 ± 0.012 | 0.144 ± 0.008 | 0.166 ± 0.008 | 0.074 ± 0.008 | 0.064 ± 0.004 |
| | 2 | 0.623 ± 0.016 | 0.147 ± 0.009 | 0.240 ± 0.010 | 0.223 ± 0.008 | 0.176 ± 0.010 | 0.145 ± 0.008 | 0.070 ± 0.007 | |
The first two columns present the data combinations of different batches participating in the comparison. The other columns are similar to the comparisons in Table 1
Fig. 2 Visualizations of Batch 3 as the source and Batch 2 as the target of the private MALDI MS data. (A) and (B) are colored by batch indices. In (C) and (D), the samples are colored by disease labels (Class 1: SLE; Class 0: control)
Fig. 3 (a) Venn diagram of the metabolite peaks shared among the three batches as potential biomarkers, with model selection frequency > 90% and p < 0.05. (b) Boxplots of six common m/z features that differ significantly between the case and control groups. (c) Permutation test with random labels, using batch 1 as source and batch 2 as target
Classification results on the MALDI MS data
| Metric | Source | Before calibration, target 1 | Before calibration, target 2 | Before calibration, target 3 | After calibration, target 1 | After calibration, target 2 | After calibration, target 3 |
|---|---|---|---|---|---|---|---|
| ACC | 1 | 0.926 | 0.753 | 0.813 | 0.926 | 0.889 | 0.879 |
| | 2 | 0.799 | 0.911 | 0.828 | 0.875 | 0.911 | 0.870 |
| | 3 | 0.876 | 0.763 | 0.927 | 0.907 | 0.884 | 0.927 |
| F-score | 1 | 0.915 | 0.807 | 0.828 | 0.915 | 0.904 | 0.875 |
| | 2 | 0.814 | 0.919 | 0.823 | 0.867 | 0.919 | 0.863 |
| | 3 | 0.865 | 0.741 | 0.929 | 0.904 | 0.899 | 0.929 |
| AUC | 1 | 0.923 | 0.729 | 0.813 | 0.923 | 0.882 | 0.879 |
| | 2 | 0.809 | 0.912 | 0.828 | 0.875 | 0.912 | 0.870 |
| | 3 | 0.874 | 0.786 | 0.927 | 0.909 | 0.879 | 0.927 |
| MCC | 1 | 0.857 | 0.505 | 0.637 | 0.857 | 0.774 | 0.758 |
| | 2 | 0.648 | 0.831 | 0.656 | 0.749 | 0.831 | 0.743 |
| | 3 | 0.750 | 0.593 | 0.866 | 0.816 | 0.764 | 0.866 |
| ACC | 1 | 0.922 | 0.769 | 0.822 | 0.922 | 0.896 | 0.892 |
| | 2 | 0.791 | 0.909 | 0.832 | 0.891 | 0.909 | 0.876 |
| | 3 | 0.871 | 0.769 | 0.925 | 0.921 | 0.892 | 0.925 |
| F-score | 1 | 0.911 | 0.821 | 0.837 | 0.911 | 0.910 | 0.890 |
| | 2 | 0.814 | 0.918 | 0.827 | 0.883 | 0.918 | 0.870 |
| | 3 | 0.862 | 0.749 | 0.926 | 0.916 | 0.905 | 0.926 |
| AUC | 1 | 0.919 | 0.744 | 0.822 | 0.919 | 0.889 | 0.892 |
| | 2 | 0.802 | 0.909 | 0.832 | 0.890 | 0.909 | 0.875 |
| | 3 | 0.870 | 0.793 | 0.925 | 0.921 | 0.888 | 0.925 |
| MCC | 1 | 0.848 | 0.539 | 0.658 | 0.848 | 0.789 | 0.784 |
| | 2 | 0.636 | 0.827 | 0.666 | 0.780 | 0.827 | 0.753 |
| | 3 | 0.740 | 0.608 | 0.858 | 0.841 | 0.779 | 0.858 |
The top half reports results at the sample level, and the bottom half at the subject level. When the source and target IDs are the same, we perform in-batch cross-validation, whose results are free of batch effect
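One common way to move from sample-level to subject-level predictions is to pool the predictions of all samples belonging to the same subject. The sketch below assumes a simple majority vote with ties resolved toward the positive class; the aggregation rule, the function name `subject_level_predictions`, and the toy IDs are illustrative assumptions, since the aggregation rule is not stated here.

```python
from collections import defaultdict


def subject_level_predictions(subject_ids, sample_preds):
    """Majority vote over the binary sample predictions of each subject.

    Ties are resolved toward class 1 (an arbitrary choice made for this sketch).
    """
    votes = defaultdict(list)
    for sid, pred in zip(subject_ids, sample_preds):
        votes[sid].append(pred)
    return {sid: int(2 * sum(v) >= len(v)) for sid, v in votes.items()}


# Hypothetical example: three subjects with several spectra each.
subject_ids = ["s1", "s1", "s1", "s2", "s2", "s3", "s3", "s3"]
sample_preds = [1, 1, 0, 0, 0, 1, 0, 0]

print(subject_level_predictions(subject_ids, sample_preds))
```

Under a rule like this, isolated sample-level errors can be outvoted by the other samples of the same subject.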
Comparison of diagnosis accuracy with one source batch for training and another target batch for testing
| Source | Target | Baseline | ComBat | Ratio_G | fSVA | ResNet | NormAE | Remove_R | Ours |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 0.753 | 0.778 | 0.798 | 0.773 | 0.791 | 0.805 | 0.852 | |
| 1 | 3 | 0.813 | 0.797 | 0.858 | 0.836 | 0.803 | 0.812 | 0.856 | |
| 2 | 1 | 0.799 | 0.817 | 0.821 | 0.857 | 0.824 | 0.827 | 0.839 | |
| 2 | 3 | 0.828 | 0.851 | 0.818 | 0.829 | 0.852 | 0.866 | 0.833 | |
| 3 | 1 | 0.876 | 0.861 | 0.864 | 0.854 | 0.868 | 0.889 | 0.863 | |
| 3 | 2 | 0.763 | 0.759 | 0.754 | 0.824 | 0.805 | 0.799 | 0.821 | |
| Overall | | 0.805 | 0.811 | 0.819 | 0.829 | 0.824 | 0.833 | 0.844 | |
“Baseline” denotes classification based on raw input data without any calibration for batch effect removal
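The "Overall" row appears to be the unweighted mean of the six source/target pairs in each column. The short check below copies the pairwise accuracies from the table and compares their means with the reported overall values; the column labels follow the alignment of the table as given, which is an assumption.

```python
# Pairwise accuracies in row order: (1,2), (1,3), (2,1), (2,3), (3,1), (3,2).
pairwise_acc = {
    "Baseline": [0.753, 0.813, 0.799, 0.828, 0.876, 0.763],
    "ComBat":   [0.778, 0.797, 0.817, 0.851, 0.861, 0.759],
    "Ratio_G":  [0.798, 0.858, 0.821, 0.818, 0.864, 0.754],
    "fSVA":     [0.773, 0.836, 0.857, 0.829, 0.854, 0.824],
    "ResNet":   [0.791, 0.803, 0.824, 0.852, 0.868, 0.805],
    "NormAE":   [0.805, 0.812, 0.827, 0.866, 0.889, 0.799],
    "Remove_R": [0.852, 0.856, 0.839, 0.833, 0.863, 0.821],
}

reported_overall = {
    "Baseline": 0.805, "ComBat": 0.811, "Ratio_G": 0.819, "fSVA": 0.829,
    "ResNet": 0.824, "NormAE": 0.833, "Remove_R": 0.844,
}

# Unweighted mean over the six batch pairs for each method.
computed_overall = {m: sum(v) / len(v) for m, v in pairwise_acc.items()}

for method, mean_acc in computed_overall.items():
    gap = abs(mean_acc - reported_overall[method])
    print(f"{method:9s} mean={mean_acc:.4f} reported={reported_overall[method]:.3f} gap={gap:.4f}")
```

Each computed mean agrees with the reported value to within rounding at the third decimal place.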
Fig. 4 The architecture of the proposed deep learning framework for joint batch effect removal and classification. The source batch and the target batch are processed through the same calibrator, such that both batches are compactly distributed in the latent space. The source batch supervises the training of the discriminator, which then predicts the labels for the target batch in testing. Two reconstructors are used to ensure that the input data can be fully recovered from the latent encoding