Hamdi Mbarek, Massimiliano Cocca, Yasser Al-Sarraj, Chadi Saad, Massimo Mezzavilla, Wadha AlMuftah, Dario Cocciadiferro, Antonio Novelli, Isabella Quinti, Azza AlTawashi, Salvino Salvaggio, Asma AlThani, Giuseppe Novelli, Said I Ismail.
Abstract
Host genomic information, specifically genomic variations, may characterize susceptibility to disease and identify people with a higher risk of harm, leading to better targeting of care and vaccination. Italy was the epicentre of the spread of COVID-19 in Europe and the first country to go into a national lockdown, and it has one of the highest COVID-19-associated mortality rates. Qatar, on the other hand, has a very low mortality rate. In this study, we compared whole-genome sequencing data from 14,398 adult Qatari nationals with data from 925 Italian individuals. We also included in the comparison whole-exome sequence data from 189 Italian laboratory-confirmed COVID-19 cases. We focused our study on a curated list of 3619 candidate genes involved in innate immunity and host-pathogen interaction. Two population-gene metric scores, the Delta Singleton-Cohort variant score (DSC) and the Sum Singleton-Cohort variant score (SSC), were applied to estimate the presence of selective constraints in the Qatari population and in the Italian cohorts. Results based on the DSC and SSC metrics demonstrated a different selective pressure on three genes (MUC5AC, ABCA7, FLNA) between the Qatari and Italian populations. This study highlighted the genetic differences between the Qatari and Italian populations and identified a subset of genes involved in innate immunity and host-pathogen interaction.
Keywords: COVID-19; COVID-19 severity; genetic constraints; population genetics
Year: 2021 PMID: 34828448 PMCID: PMC8623290 DOI: 10.3390/genes12111842
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. PCA plot of the QGP and INGI cohorts projected onto 1000 Genomes Project data. As expected, the first two principal components already show the separation between the QGP and the INGI cohorts and the overlap with the selected populations for the ancestry-related comparisons.
Figure 2. Distributions of the prioritization scores. Violin plots of the distributions of DSC (top panel) and SSC (bottom panel) scores in the subset of selected genes for all target populations (CAR, FVG, VBI, QGP) and all reference outbred populations (AFR, EUR, SAS) from the 1000 Genomes Project.
Results from comparison of DSC scores between target cohorts (CAR, FVG, VBI, QGP) and the relevant reference superpopulations from the 1000 Genomes Project (EUR, AFR, SAS). The last column refers to the nature of the comparison carried out, as detailed in Supplementary Table S3.
| Transcript ID | Gene Name | QGP | CAR | FVG | VBI | EUR | AFR | SAS | Comparison |
|---|---|---|---|---|---|---|---|---|---|
| ENST00000369850 | | 3.854 | −2.435 | 0.272 | −2.510 | −2.399 | −2.166 | −1.879 | C5 |
| ENST00000350763 | | 3.370 | −3.792 | 1.388 | −0.631 | 2.666 | 1.838 | 2.187 | C4 |
| ENST00000389048 | | 2.575 | 3.651 | 0.290 | −4.098 | 3.388 | 3.212 | 2.852 | C4 |
| ENST00000263094 | | 2.566 | −0.433 | −2.168 | 0.071 | 3.020 | 2.681 | 2.150 | C4 |
| ENST00000647814 | | 2.528 | −3.466 | 0.467 | 3.004 | 2.562 | 2.877 | 2.508 | C4 |
| ENST00000621226 | | 2.435 | −2.404 | −2.017 | −2.892 | 3.477 | 3.500 | 3.032 | C4 |
| ENST00000634891 | | 2.229 | −3.377 | −1.554 | −3.586 | −2.431 | 2.639 | −3.449 | C8 |
| ENST00000542267 | | 2.026 | −1.232 | 0.180 | −2.658 | 3.086 | 0.266 | 2.477 | C4 |
| ENST00000589042 | | −2.242 | −2.595 | 3.411 | 4.584 | −2.388 | −1.965 | 3.498 | C1 |
| ENST00000357387 | | −2.369 | −2.206 | −0.033 | 2.407 | 2.181 | 1.284 | −4.070 | C7 |
| ENST00000561890 | | −2.472 | −1.682 | −1.632 | 2.156 | −2.562 | −2.191 | −2.425 | C3 |
| ENST00000336596 | | −3.001 | 2.139 | −3.421 | −1.535 | −3.704 | 1.921 | −2.814 | C3 |
| ENST00000648947 | | −3.444 | −1.309 | 2.424 | −2.928 | −3.439 | −2.692 | −0.727 | C3 |
| ENST00000389484 | | −4.888 | −4.916 | 3.192 | 2.602 | −4.245 | −2.136 | 3.231 | C1 |
Results from comparison of SSC scores between target cohorts (CAR, FVG, VBI, QGP) and the relevant reference superpopulations from the 1000 Genomes Project (EUR, AFR, SAS). The last column refers to the nature of the comparison carried out, as detailed in Supplementary Table S3.
| Transcript ID | Gene Name | QGP | CAR | FVG | VBI | EUR | AFR | SAS | Comparison |
|---|---|---|---|---|---|---|---|---|---|
| ENST00000378473 | | −4.524 | 2.917 | −3.907 | −0.413 | −4.272 | −2.873 | −3.079 | C3 |
| ENST00000366574 | | −4.347 | 3.792 | −4.694 | −5.053 | −3.704 | −2.318 | −2.087 | C3 |
| ENST00000315872 | | −3.680 | 3.822 | −2.372 | −1.149 | −4.160 | −3.248 | −3.665 | C3 |
| ENST00000361445 | | −3.371 | −0.975 | 0.100 | 2.659 | −3.378 | −4.276 | −3.888 | C3 |
| ENST00000358691 | | −3.131 | 4.249 | 1.658 | −3.729 | −3.437 | −2.929 | 3.413 | C1 |
| ENST00000355286 | | −3.000 | −1.900 | 2.671 | −1.486 | −2.087 | −2.589 | −0.756 | C3 |
| ENST00000381501 | | −2.996 | −2.615 | −1.656 | 3.085 | −2.497 | −2.427 | −0.767 | C3 |
| ENST00000265382 | | −2.952 | 2.574 | −0.576 | −2.746 | −3.246 | −3.197 | −1.583 | C3 |
| ENST00000359015 | | −2.758 | 2.108 | 0.850 | 1.431 | −3.555 | −2.472 | −3.293 | C3 |
| ENST00000335670 | | −2.586 | −2.953 | 1.176 | 2.228 | −2.499 | −2.743 | −0.526 | C3 |
| ENST00000370056 | | −2.523 | 3.467 | 1.312 | 1.038 | −3.282 | −2.646 | −1.406 | C3 |
| ENST00000381298 | | −2.522 | −1.224 | 3.859 | 2.542 | −2.120 | −1.466 | −2.644 | C3 |
| ENST00000432237 | | −2.506 | −1.404 | 2.793 | −2.306 | −2.419 | −0.629 | −2.156 | C3 |
| ENST00000392552 | | −2.338 | −1.261 | −1.336 | 2.338 | −2.417 | −1.608 | −2.585 | C3 |
| ENST00000382292 | | −2.324 | −4.408 | 3.917 | 2.284 | −3.530 | −2.726 | −2.082 | C3 |
| ENST00000392132 | | −2.176 | −2.147 | 2.722 | −1.257 | −2.673 | −2.107 | −1.787 | C3 |
| ENST00000313708 | | −2.068 | 2.253 | −1.422 | −0.914 | −2.980 | −1.665 | −3.222 | C3 |
| ENST00000400841 | | 2.036 | −1.347 | −1.496 | −2.083 | 2.581 | 2.185 | 1.052 | C4 |
| ENST00000369850 | | 2.058 | −3.158 | −0.860 | −4.025 | −3.073 | −3.351 | −3.097 | C5 |
| ENST00000344327 | | 2.062 | −3.776 | −2.671 | −2.711 | −3.382 | 0.278 | −2.242 | C5 |
| ENST00000263317 | | 2.225 | −2.716 | −2.554 | −2.717 | 2.134 | 3.532 | 3.770 | C4 |
| ENST00000403662 | | 2.237 | −2.363 | 1.702 | 0.620 | 2.613 | 0.319 | 2.782 | C4 |
| ENST00000297494 | | 2.243 | 1.436 | −2.178 | −0.851 | 2.109 | 2.460 | 2.455 | C4 |
| ENST00000295598 | | 2.258 | −2.679 | 0.547 | 0.930 | −2.204 | −1.886 | −2.448 | C5 |
| ENST00000085219 | | 2.288 | 0.576 | 0.028 | −2.311 | 2.368 | −0.600 | 2.142 | C4 |
| ENST00000305877 | | 2.397 | −1.338 | 2.028 | −3.021 | 3.994 | 2.923 | 3.631 | C4 |
| ENST00000333149 | | 2.501 | 2.275 | 1.138 | −2.022 | 2.271 | 1.372 | 3.197 | C4 |
| ENST00000271332 | | 2.522 | 3.651 | −2.581 | 2.455 | 2.443 | −2.129 | 2.936 | C2 |
| ENST00000447648 | | 2.666 | −2.351 | 1.822 | −1.669 | 2.777 | 3.213 | −0.027 | C4 |
| ENST00000324856 | | 3.434 | −3.120 | −2.683 | −1.845 | −2.085 | −2.053 | 0.835 | C5 |
| ENST00000263094 | | 3.796 | −1.601 | −2.581 | 1.004 | 2.591 | 3.325 | 3.998 | C4 |
| ENST00000372923 | | 3.941 | −2.514 | −1.222 | −0.710 | −2.066 | −1.575 | −2.077 | C5 |
| ENST00000621226 | | 3.965 | −3.705 | −3.751 | −4.601 | 3.679 | 3.267 | 4.244 | C4 |
| ENST00000533211 | | 4.531 | −2.266 | 1.209 | −1.893 | 2.589 | 2.191 | 2.778 | C4 |
| ENST00000529681 | | 4.744 | 3.085 | 1.483 | −2.070 | 4.884 | 4.396 | 5.095 | C4 |
List of genes with a concordant signature of selection between DSC and SSC scores, after the comparison between target cohorts (CAR, FVG, VBI, QGP) and the relevant reference superpopulations from the 1000 Genomes Project (EUR, AFR, SAS).
| Transcript ID | Gene Name | DSC QGP | DSC CAR | DSC FVG | DSC VBI | DSC EUR | DSC AFR | DSC SAS | SSC QGP | SSC CAR | SSC FVG | SSC VBI | SSC EUR | SSC AFR | SSC SAS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ENST00000369850 | | 3.854 | −2.435 | 0.272 | −2.510 | −2.399 | −2.166 | −1.879 | 2.058 | −3.158 | −0.860 | −4.025 | −3.073 | −3.351 | −3.097 |
| ENST00000263094 | | 2.566 | −0.433 | −2.168 | 0.071 | 3.020 | 2.681 | 2.150 | 3.796 | −1.601 | −2.581 | 1.004 | 2.591 | 3.325 | 3.998 |
| ENST00000621226 | | 2.435 | −2.404 | −2.017 | −2.892 | 3.477 | 3.500 | 3.032 | 3.965 | −3.705 | −3.751 | −4.601 | 3.679 | 3.267 | 4.244 |
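
The concordant list can be cross-checked against the two score tables above. The following is a minimal sketch, not the authors' pipeline: it assumes that "concordant" means a transcript appears in both the DSC and the SSC table and that its QGP scores have the same sign in both. The source does not state the exact criterion, and the names `dsc_qgp`, `ssc_qgp`, and `concordant_transcripts` are hypothetical. The QGP values are transcribed from the tables.

```python
# QGP column of the DSC table (transcript ID -> DSC score), transcribed from above.
dsc_qgp = {
    "ENST00000369850": 3.854, "ENST00000350763": 3.370,
    "ENST00000389048": 2.575, "ENST00000263094": 2.566,
    "ENST00000647814": 2.528, "ENST00000621226": 2.435,
    "ENST00000634891": 2.229, "ENST00000542267": 2.026,
    "ENST00000589042": -2.242, "ENST00000357387": -2.369,
    "ENST00000561890": -2.472, "ENST00000336596": -3.001,
    "ENST00000648947": -3.444, "ENST00000389484": -4.888,
}

# QGP column of the SSC table (transcript ID -> SSC score), transcribed from above.
ssc_qgp = {
    "ENST00000378473": -4.524, "ENST00000366574": -4.347,
    "ENST00000315872": -3.680, "ENST00000361445": -3.371,
    "ENST00000358691": -3.131, "ENST00000355286": -3.000,
    "ENST00000381501": -2.996, "ENST00000265382": -2.952,
    "ENST00000359015": -2.758, "ENST00000335670": -2.586,
    "ENST00000370056": -2.523, "ENST00000381298": -2.522,
    "ENST00000432237": -2.506, "ENST00000392552": -2.338,
    "ENST00000382292": -2.324, "ENST00000392132": -2.176,
    "ENST00000313708": -2.068, "ENST00000400841": 2.036,
    "ENST00000369850": 2.058, "ENST00000344327": 2.062,
    "ENST00000263317": 2.225, "ENST00000403662": 2.237,
    "ENST00000297494": 2.243, "ENST00000295598": 2.258,
    "ENST00000085219": 2.288, "ENST00000305877": 2.397,
    "ENST00000333149": 2.501, "ENST00000271332": 2.522,
    "ENST00000447648": 2.666, "ENST00000324856": 3.434,
    "ENST00000263094": 3.796, "ENST00000372923": 3.941,
    "ENST00000621226": 3.965, "ENST00000533211": 4.531,
    "ENST00000529681": 4.744,
}


def concordant_transcripts(dsc, ssc):
    """Return transcripts present in both tables whose two scores share a sign.

    Assumed criterion (not stated in the source): same-sign scores in both metrics.
    """
    shared = sorted(set(dsc) & set(ssc))
    return [t for t in shared if (dsc[t] > 0) == (ssc[t] > 0)]


concordant = concordant_transcripts(dsc_qgp, ssc_qgp)
print(concordant)
```

Under this assumed criterion, the three transcripts returned are the same three listed in the concordance table.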
Comparison of Singleton burden between the COVID-19 positive cohort (TOV) and other target and reference populations. The reported p-values refer to the comparison between whole gene singleton burden (“p-value whole gene” column) and coding regions singletons burden (“p-value CDS region” column). All singleton counts have been adjusted considering the sample size of each cohort.
| Transcript ID | Gene Name | Cohort | p-value whole gene | p-value CDS region |
|---|---|---|---|---|
| ENST00000369850 | | CAR | 0.630140 | 0.409653 |
| | | FVG | 0.046901 | 0.316565 |
| | | VBI | 0.000458 | 0.013015 |
| | | QGP | 0.000028 | 0.039803 |
| | | EUR | 0.312323 | 0.786342 |
| | | AFR | 0.878006 | 0.787767 |
| | | SAS | 0.200408 | 0.813561 |
| ENST00000263094 | | CAR | 3.2746 × 10^−11 | 2.5959 × 10^−7 |
| | | FVG | 3.1278 × 10^−23 | 1.0535 × 10^−17 |
| | | VBI | 7.1607 × 10^−21 | 2.0060 × 10^−16 |
| | | QGP | 6.2413 × 10^−63 | 1.5606 × 10^−40 |
| | | EUR | 4.4467 × 10^−10 | 1.0966 × 10^−8 |
| | | AFR | 1.7360 × 10^−8 | 5.3713 × 10^−9 |
| | | SAS | 2.3435 × 10^−4 | 3.7924 × 10^−6 |
| ENST00000621226 | | CAR | 7.4274 × 10^−12 | 8.4692 × 10^−11 |
| | | FVG | 1.8836 × 10^−31 | 2.7623 × 10^−24 |
| | | VBI | 2.3148 × 10^−36 | 3.2118 × 10^−30 |
| | | QGP | 1.5512 × 10^−76 | 1.2101 × 10^−64 |
| | | EUR | 3.3142 × 10^−6 | 4.0071 × 10^−8 |
| | | AFR | 2.9701 × 10^−8 | 1.0241 × 10^−10 |
| | | SAS | 7.2316 × 10^−3 | 9.9471 × 10^−5 |
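
As a reading aid for the singleton-burden table, the sketch below lists the cohorts whose comparison with TOV falls under a significance threshold for transcript ENST00000369850. It rests on two assumptions that the source does not confirm: that the first numeric column is the whole-gene p-value and the second the CDS-region p-value (the order in which the caption names them), and that 0.05 is a reasonable illustrative cutoff. The names `p_values` and `ALPHA` are hypothetical, and no multiple-testing correction is applied.

```python
# p-values for ENST00000369850, transcribed from the table above.
# Assumed column order: (whole gene, CDS region), following the caption.
p_values = {
    "CAR": (0.630140, 0.409653),
    "FVG": (0.046901, 0.316565),
    "VBI": (0.000458, 0.013015),
    "QGP": (0.000028, 0.039803),
    "EUR": (0.312323, 0.786342),
    "AFR": (0.878006, 0.787767),
    "SAS": (0.200408, 0.813561),
}

ALPHA = 0.05  # illustrative threshold; not taken from the source

# Cohorts under the threshold for the whole-gene comparison only.
whole_gene_hits = [c for c, (whole, _cds) in p_values.items() if whole < ALPHA]

# Cohorts under the threshold in both the whole-gene and the CDS comparison.
both_hits = [c for c, (whole, cds) in p_values.items() if whole < ALPHA and cds < ALPHA]

print("whole gene:", whole_gene_hits)
print("whole gene and CDS:", both_hits)
```

For the other two transcripts in the table, every listed p-value is already far below this illustrative cutoff, so the same filter would return all seven cohorts.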