Marta Gallego-Paüls1,2,3, Carles Hernández-Ferrer1,2,3, Mariona Bustamante1,2,3,4, Xavier Basagaña1,2,3, Jose Barrera-Gómez1,2,3, Chung-Ho E Lau5,6, Alexandros P Siskos7, Marta Vives-Usano1,2,3,4, Carlos Ruiz-Arenas1,2,3, John Wright8, Remy Slama9, Barbara Heude10, Maribel Casas1,2,3, Regina Grazuleviciene11, Leda Chatzi12, Eva Borràs2,4, Eduard Sabidó2,4, Ángel Carracedo13,14, Xavier Estivill4, Jose Urquiza1,2,3, Muireann Coen6,15, Hector C Keun7, Juan R González1,2,3, Martine Vrijheid1,2,3, Léa Maitre16,17,18.
Abstract
BACKGROUND: Multiple omics technologies are increasingly applied to detect early, subtle molecular responses to environmental stressors for future disease risk prevention. However, there is an urgent need for further evaluation of stability and variability of omics profiles in healthy individuals, especially during childhood.
Keywords: Children; Cross-omics; DNA methylation; Exposome; Metabolomics; Multi-omics; Population study; Variability; mRNA; miRNA
Year: 2021 PMID: 34289836 PMCID: PMC8296694 DOI: 10.1186/s12916-021-02027-z
Source DB: PubMed Journal: BMC Med ISSN: 1741-7015 Impact factor: 8.775
Fig. 1 Study workflow
Omics data description and technical variability management
| Omics profile | Matrix | Sample size (omics available for both visits) | Number of features | Laboratory processing | Batch correction | Criteria for feature exclusion |
|---|---|---|---|---|---|---|
| DNA methylome | Blood leukocytes | 149 | 91601 | Randomized by cohort and sex, and panel samples paired in plate and array | Residuals of SVs protecting for cohort, sex and age. Cell type composition also corrected with SVs. | < 98% call rate and < 62.5% ICC |
| Gene expression | Whole blood | 105 | 45438 | Randomized by cohort and sex, and panel samples paired in plate | Residuals of SVs protecting for cohort, sex and age. Cell type composition also corrected with SVs. | < 25% call rate |
| miRNA | Whole blood | 100 | 453 | Randomized by cohort and sex, and panel samples paired in plate and array | Residuals of SVs protecting for cohort, sex and age. Cell type composition also corrected with SVs. | < 25% call rate |
| Proteome | Plasma | 149 | 36 | Randomized by cohort | Overall protein average minus plate specific protein average subtracted for each individual and each protein | < 30% measurements in the linear range (LIN) |
| Serum metabolome | Serum | 154 | 177 | Fully randomized | - | > 30% CV and > 30% BLD + zeros |
| Urine metabolome | Urine | 154 | 44 | Fully randomized | - | > 30% CV |
Definitions. Call rate (for DNA methylome): proportion of samples in which a given CpG is detected; call rate (for miRNA and gene expression): proportion of samples in which a given gene or miRNA is detected. Abbreviations. SV: surrogate variable; ICC: intraclass correlation coefficient [35]; CV: coefficient of variation; BLD: below limit of detection.
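The plate correction for proteins and the CV-based exclusion rule in the table can be illustrated with a short sketch. It rests on two assumptions that the table does not confirm: that the plate correction is meant to remove plate offsets (corrected value = value − plate mean + overall mean; the wording in the table is ambiguous about the sign), and that the CV is computed over repeated quality-control measurements of a feature. The function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def center_by_plate(values, plates):
    """Remove plate offsets from one protein's measurements.

    Assumed reading of the table: value - plate mean + overall mean.
    """
    overall = mean(values)
    by_plate = defaultdict(list)
    for value, plate in zip(values, plates):
        by_plate[plate].append(value)
    plate_mean = {plate: mean(vals) for plate, vals in by_plate.items()}
    return [v - plate_mean[p] + overall for v, p in zip(values, plates)]


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


def keep_feature(qc_replicates, max_cv=30.0):
    """Keep a feature only if its CV is at most max_cv percent.

    Assumption: the CV is taken over QC replicates of that feature.
    """
    return coefficient_of_variation(qc_replicates) <= max_cv
```

After centering, every plate has the same mean for that protein, so remaining differences between children are not driven by a constant plate offset.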
Population description (N=156)
| | Start of the study |
|---|---|
| Male | 89 |
| Female | 67 |
| European ancestry | 145 |
| Pakistani | 10 |
| Other | 1 |
| BIB, UK | 28 |
| EDEN, France | 28 |
| KANC, Lithuania | 30 |
| RHEA, Greece | 30 |
| INMA, Spain | 40 |
| Total | 7.8 (1.7) |
| BIB | 6.7 (0.2) |
| EDEN | 10.8 (0.5) |
| KANC | 6.7 (0.5) |
| RHEA | 6.3 (0.12) |
| INMA | 8.6 (0.5) |
| | 0.4 (1.2) |
| Thinness (zBMI < −2) | 1 |
| Healthy (−2 ≤ zBMI < 1) | 111 |
| Overweight (1 ≤ zBMI ≤ 2) | 27 |
| Obese (zBMI > 2) | 17 |
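
The zBMI cut-offs listed in the table translate directly into a small classification rule. The sketch below only restates those thresholds; the function name is hypothetical and how zBMI itself was derived is not covered here.

```python
def zbmi_category(zbmi):
    """Map a BMI z-score to the categories listed in the population table."""
    if zbmi < -2:
        return "Thinness"
    if zbmi < 1:
        return "Healthy"      # -2 <= zBMI < 1
    if zbmi <= 2:
        return "Overweight"   # 1 <= zBMI <= 2
    return "Obese"            # zBMI > 2
```

The boundary values follow the table: a z-score of exactly −2 counts as healthy, and both 1 and 2 count as overweight.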
Fig. 2 Variance partition analysis of omics data. Total variance was apportioned between cohort, inter-individual and intra-individual effects. (A) The heatmap colour (yellow to red) indicates the variance of features at each coordinate. (B) The violin plot describes the statistics of the variance explained by each component.
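To make the three components concrete, the following sketch splits the variability of a single feature into cohort, inter-individual and intra-individual shares. It is a plain sums-of-squares decomposition over nested group means, chosen for illustration; it is not necessarily the estimation method behind the figure, which this text does not specify. The record layout `(cohort, child_id, value)` and the function name are assumptions.

```python
from collections import defaultdict
from statistics import mean


def partition_variance(records):
    """Split one feature's total sum of squares into three shares.

    records: list of (cohort, child_id, value), with repeated visits
    per child. Returns fractions that sum to 1.
    """
    grand = mean(value for _, _, value in records)
    by_cohort = defaultdict(list)
    by_child = defaultdict(list)
    for cohort, child, value in records:
        by_cohort[cohort].append(value)
        by_child[(cohort, child)].append(value)
    cohort_mean = {c: mean(vs) for c, vs in by_cohort.items()}
    child_mean = {k: mean(vs) for k, vs in by_child.items()}

    # Each term is summed over records, so groups are weighted by size.
    ss_cohort = sum((cohort_mean[c] - grand) ** 2 for c, _, _ in records)
    ss_inter = sum((child_mean[(c, i)] - cohort_mean[c]) ** 2
                   for c, i, _ in records)
    ss_intra = sum((v - child_mean[(c, i)]) ** 2 for c, i, v in records)
    total = ss_cohort + ss_inter + ss_intra
    if total == 0:
        raise ValueError("feature has no variability")
    return {
        "cohort": ss_cohort / total,
        "inter_individual": ss_inter / total,
        "intra_individual": ss_intra / total,
    }
```

A feature with a high intra-individual share changes between visits of the same child, whereas a high inter-individual share points to stable differences between children.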
Fig. 3 Network representation of the Gaussian graphical model (GGM) of the DNA methylome, proteins, serum and urine metabolites with high intra-individual variability measured in 157 children from five European countries. Blue nodes represent CpG sites. Red and yellow nodes represent serum and urine metabolites, respectively. The opacity of the nodes is dependent on their degree (number of edges connecting a particular feature). The edge thickness was weighted based on the partial correlation coefficients (PCCs) obtained from the GGM.
Fig. 4 Main connected components of the Gaussian graphical model (GGM) network that involve direct associations between features from different omics layers. Blue nodes represent CpG sites. Red and yellow nodes represent serum and urine metabolites, respectively. The opacity of the nodes is dependent on their degree (number of edges connecting a particular feature). The edge thickness was weighted based on the partial correlation coefficients (PCCs) obtained from the GGM.
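The two quantities that drive these network figures, partial correlation coefficients and node degree, can be sketched as follows. In a GGM, the PCC between features i and j is −P_ij / sqrt(P_ii · P_jj), where P is the inverse of the covariance matrix. The sketch inverts a small covariance matrix directly, which is only a stand-in: with many more features than samples, as is typical for omics data, a regularised or shrinkage estimator is needed, and the estimator used for the figures is not stated here. The edge threshold and function names are assumptions for illustration.

```python
from math import sqrt


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [x - factor * y
                            for x, y in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(covariance):
    """PCC_ij = -P_ij / sqrt(P_ii * P_jj), with P the inverse covariance."""
    precision = invert(covariance)
    n = len(precision)
    return [[1.0 if i == j else
             -precision[i][j] / sqrt(precision[i][i] * precision[j][j])
             for j in range(n)] for i in range(n)]


def node_degrees(pcc, threshold):
    """Edges per node, keeping edges with |PCC| above the threshold."""
    n = len(pcc)
    return [sum(1 for j in range(n) if j != i and abs(pcc[i][j]) > threshold)
            for i in range(n)]
```

For three features linked in a chain (the first and third correlated only through the second), the PCC between the first and third is zero, so the network shows no direct edge between them even though their ordinary correlation is not zero.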
Fig. 5 Violin plots showing multi-omics variability decomposed by biological traits and sample collection parameters measured in the study. Labels correspond to omics features mostly explained by each variable. Abbreviations. Proteome: IL: interleukin; Apo A1: apolipoprotein A1; RA: receptor antagonist; CRP: C-reactive protein. Serum metabolome: C: acylcarnitine; SM: sphingomyelin; PC: phosphatidylcholine; lysoPC: lysophosphatidylcholine.