Matteo Fumagalli, Uberto Pozzoli, Rachele Cagliani, Giacomo P Comi, Nereo Bresolin, Mario Clerici, Manuela Sironi.
Abstract
Viruses have exerted a constant and potent selective pressure on human genes throughout evolution. We utilized the marks left by selection on allele frequency to identify viral infection-associated allelic variants. Virus diversity (the number of different viruses in a geographic region) was used to measure virus-driven selective pressure. Results showed an excess of variants correlated with virus diversity in genes involved in immune response and in the biosynthesis of glycan structures functioning as viral receptors; a significantly higher than expected number of variants was also seen in genes encoding proteins that directly interact with viral components. Genome-wide analyses identified 441 variants significantly associated with virus diversity; these are more frequently located within gene regions than expected, and they map to 139 human genes. Analysis of functional relationships among genes subjected to virus-driven selective pressure identified a complex network enriched in viral products-interacting proteins. The novel approach to the study of infectious disease epidemiology presented herein may represent an alternative to classic genome-wide association studies and provides a large set of candidate susceptibility variants for viral infections.
Year: 2010 PMID: 20174570 PMCID: PMC2824813 DOI: 10.1371/journal.pgen.1000849
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Populations in the HGDP-CEPH panel and virus diversity estimates.
| Population | Country | Sampled individuals | Virus diversity |
| Bantu North East | Kenya | 11 | 49 |
| Bantu South East | South Africa | 8 | 46 |
| Biaka Pygmies | Central African Republic | 23 | 54 |
| Mandenka | Senegal | 22 | 51 |
| Mbuti Pygmies | Democratic Republic of Congo | 13 | 50 |
| San | Namibia | 5 | 42 |
| Yoruba | Nigeria | 21 | 54 |
| Colombians | Colombia | 7 | 49 |
| Karitiana | Brazil | 14 | 55 |
| Maya | Mexico | 21 | 49 |
| Pima | Mexico | 14 | 49 |
| Surui | Brazil | 8 | 55 |
| Balochi | Pakistan | 24 | 45 |
| Brahui | Pakistan | 25 | 45 |
| Burusho | Pakistan | 25 | 45 |
| Hazara | Pakistan | 22 | 45 |
| Kalash | Pakistan | 23 | 45 |
| Makrani | Pakistan | 25 | 45 |
| Pathan | Pakistan | 23 | 45 |
| Sindhi | Pakistan | 24 | 45 |
| Uygur | China | 10 | 47 |
| Cambodians | Cambodia | 10 | 42 |
| Dai | China | 10 | 47 |
| Daur | China | 9 | 47 |
| Han | China | 44 | 47 |
| Hezhen | China | 9 | 47 |
| Japanese | Japan | 29 | 41 |
| Lahu | China | 8 | 47 |
| Miaozu | China | 10 | 47 |
| Mongola | China | 10 | 47 |
| Naxi | China | 8 | 47 |
| Oroqen | China | 9 | 47 |
| She | China | 10 | 47 |
| Tu | China | 10 | 47 |
| Tujia | China | 10 | 47 |
| Xibo | China | 9 | 47 |
| Yakut | Russia | 25 | 48 |
| Yizu | China | 10 | 47 |
| Adygei | Russia | 17 | 48 |
| French | France | 28 | 42 |
| French Basque | France | 24 | 42 |
| North Italian | Italy | 13 | 43 |
| Orcadian | Orkney Islands (Scotland) | 15 | 39 |
| Russian | Russia | 25 | 48 |
| Sardinian | Italy | 28 | 43 |
| Tuscan | Italy | 8 | 43 |
| Bedouin | Israel | 46 | 41 |
| Druze | Israel | 42 | 41 |
| Mozabite | Algeria | 29 | 39 |
| Palestinian | Israel | 46 | 41 |
| NAN Melanesian | Papua New Guinea | 11 | 45 |
| Papuan | Papua New Guinea | 17 | 45 |
Enrichment of SNPs significantly associated with virus diversity in different gene lists.
| Gene list | Genes | SNPs | Corr. SNPs | Empirical p value | Contributing genes |
| InnateDB | 2915 | 59783 | 104 | 0.0105 | |
| Glycan biosynthesis | 200 | 5343 | 50 | 0.0138 | |
| Host-virus interaction | 1916 | 14746 | 80 | 0.0172 | |
Number of SNPs showing significant correlation with virus diversity.
The empirical p value was calculated as described in the text and in Materials and Methods.
Genes showing at least one SNP significantly correlated with virus diversity.
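As a reading aid for the enrichment table, the share of SNPs in each gene list that correlate with virus diversity can be worked out directly from the SNPs and Corr. SNPs columns. The sketch below does only that arithmetic; it does not reproduce the empirical p value, whose procedure is given in the Materials and Methods. The names `GENE_LISTS` and `correlated_fraction` are illustrative and do not come from the source.

```python
# Counts copied from the enrichment table: (total SNPs, correlated SNPs).
GENE_LISTS = {
    "InnateDB": (59783, 104),
    "Glycan biosynthesis": (5343, 50),
    "Host-virus interaction": (14746, 80),
}


def correlated_fraction(total_snps, correlated_snps):
    """Return the fraction of SNPs in a list that correlate with virus diversity."""
    if total_snps <= 0:
        raise ValueError("total_snps must be positive")
    return correlated_snps / total_snps


# Fraction of correlated SNPs per gene list.
fractions = {
    name: correlated_fraction(total, correlated)
    for name, (total, correlated) in GENE_LISTS.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.3%}")
```

The fractions come out at roughly 0.17%, 0.94%, and 0.54% respectively. On their own they say nothing about enrichment, which depends on a comparison with a background expectation.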
Top 30 SNPs (or SNP clusters) correlated with virus diversity.
| SNP | Gene symbol | Description | Annotation | τ |
| rs10511316 | | coiled-coil domain containing 80 | intron | 0.627 |
| rs1135029; rs189332; rs11235559 | | phosphodiesterase 2A, cGMP-stimulated | A867A; intron; intron | 0.615 |
| rs1011051; rs2278295 | | myosin VC | intron; intron | 0.609 |
| rs993715; rs2189883 | | contactin associated protein-like 2 | intron; intron | 0.609 |
| rs11581 | | - | Q1642Q | 0.607 |
| rs3785415 | | cadherin 15, type 1, M-cadherin | intron | 0.603 |
| rs17256082 | | secernin 3 | intron | 0.600 |
| rs4852988 | | annexin A4 | intron | 0.597 |
| rs4575989; rs4629443 | | C1q and tumor necrosis factor related protein 7 | intron; intron | 0.597 |
| rs7637370 | | claudin 18 | intron | 0.596 |
| rs519332 | | eyes absent 4 homolog | intron | 0.596 |
| rs2188172; rs11760238 | | lipoma HMGIC fusion partner-like 3 | intron; intron | 0.595 |
| rs1650893 | | - | Q42R | 0.594 |
| rs1322633 | | ring finger protein 217 | intron | 0.593 |
| rs7927476 | | NEL-like 1 | intron | 0.593 |
| rs2615666 | | transmembrane protein 132B | intron | 0.593 |
| rs13020779 | | DIS3 mitotic control homolog (S. cerevisiae)-like 2 | intron | 0.589 |
| rs1719596 | | leprecan-like 1 | intron | 0.589 |
| rs1065154 | | sequestosome 1 | 3′ UTR | 0.589 |
| rs12145973 | | Interleukin 19 | intron | 0.589 |
| rs1890139 | | propionyl Coenzyme A carboxylase, alpha polypeptide | intron | 0.588 |
| rs6505045 | | ankyrin-repeat and fibronectin type III domain containing 1 | intron | 0.587 |
| rs4953260 | | protein kinase C, epsilon | intron | 0.587 |
| rs4077341 | | tumor necrosis factor receptor superfamily, member 10c, decoy without an intracellular domain | intron | 0.587 |
| rs2793434 | | glycosylphosphatidylinositol specific phospholipase D1 | intron | 0.587 |
| rs6599300 | | macrophage erythroblast attacher | intron | 0.584 |
| rs13340461 | | cyclin D3 | intron | 0.584 |
| rs11784487 | | Ankyrin 1 | intron | 0.584 |
| rs10849446 | | sodium channel, nonvoltage-gated 1 alpha | intron | 0.583 |
| rs12186418 | | PDZ domain containing 2 | intron | 0.583 |
For nonsynonymous substitutions, the amino acid change is reported.
SNPs are ranked according to τ values. For multiple correlating SNPs in the same gene, the correlation coefficient is only shown for the strongest SNP.
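To make the τ column concrete, here is a minimal sketch of a rank correlation between per-population allele frequency and virus diversity. It assumes that τ denotes Kendall's rank correlation and uses the tau-b variant, which handles ties; the source may have used a different variant. The virus diversity values are taken from the population table, whereas the allele frequencies are invented purely for illustration. The helper name `kendall_tau_b` is likewise hypothetical.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences (O(n^2) pair scan)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1  # pair tied in x (possibly in y as well)
        if dy == 0:
            ties_y += 1  # pair tied in y (possibly in x as well)
        if dx != 0 and dy != 0:
            if dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denominator = sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denominator == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denominator


# Virus diversity per population, copied from the population table:
# Orcadian, Japanese, San, Papuan, Han, Yakut, Mandenka, Karitiana.
virus_diversity = [39, 41, 42, 45, 47, 48, 51, 55]

# HYPOTHETICAL allele frequencies for one SNP in the same populations.
allele_frequency = [0.10, 0.15, 0.12, 0.30, 0.28, 0.35, 0.50, 0.62]

tau = kendall_tau_b(allele_frequency, virus_diversity)
print(f"tau = {tau:.3f}")
```

A pairwise scan is quadratic in the number of populations, which is fine for a panel of a few dozen populations; a genome-wide run would normally rely on an optimized library routine instead.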
Figure 1. Network analysis of genes associated with virus diversity.
Interactions between human proteins are delimited by the hatched grey circle. Genes are represented as nodes; edges indicate known interactions (solid lines depict direct and hatched lines depict indirect interactions). Human genes are colour-coded as follows: orange, genes with at least one SNP significantly associated with virus diversity; yellow, genes with at least one SNP that did not withstand genome-wide Bonferroni correction but displayed a rank higher than the 99th percentile and a p value lower than 10⁻⁵ (these genes were not included in the input IPA list used to generate networks); grey, genes covered by at least one SNP in the HGDP-CEPH panel; white, genes with no SNPs in the panel. Virus-host interactions are shown for genes subjected to virus-driven selection only; genes interacting with viral products that display no SNP significantly associated with virus diversity are denoted with an asterisk. Viral products are reported outside the hatched circle and colour-coded as follows: purple, HIV-1; green, Human herpesvirus; blue, Human rotavirus G3; cyan, Human adenovirus 2; black, Human T-lymphotropic virus 1.
Significantly over-represented PANTHER categories.
| PANTHER category | PANTHER description | Number of genes | p value |
| Biological process | Signal transduction | 61 | 1.74×10⁻⁹ |
| Biological process | Cell adhesion-mediated signalling | 16 | 9.10×10⁻⁶ |
| Biological process | Cell adhesion | 16 | 7.98×10⁻⁴ |
| Biological process | Cell communication | 24 | 2.79×10⁻³ |
| Biological process | Neuronal activities | 13 | 1.43×10⁻² |
| Biological process | Carbohydrate metabolism | 13 | 2.05×10⁻² |
| Biological process | Extracellular matrix protein-mediated signalling | 5 | 2.47×10⁻² |
| Biological process | Immunity and defense | 21 | 3.56×10⁻² |
| Molecular function | Receptor | 30 | 4.27×10⁻⁵ |
| Molecular function | Other receptor | 11 | 3.19×10⁻⁴ |
| Molecular function | Extracellular matrix linker protein | 4 | 5.27×10⁻³ |
| Molecular function | Extracellular matrix | 10 | 2.29×10⁻² |
Number of genes that correlate with virus diversity in each PANTHER category.
p values are Bonferroni corrected.
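For readers unfamiliar with the correction, a Bonferroni adjustment multiplies each raw p value by the number of tests performed and caps the result at 1. The sketch below illustrates that rule only. The raw p values and the number of tested categories are not given here, so the inputs are hypothetical, as is the function name `bonferroni_correct`.

```python
def bonferroni_correct(p_values, n_tests):
    """Return Bonferroni-adjusted p values: min(1, p * n_tests) for each p."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    adjusted = []
    for p in p_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p value out of range: {p}")
        adjusted.append(min(1.0, p * n_tests))
    return adjusted


# HYPOTHETICAL raw p values and number of tested categories.
raw_p_values = [1e-6, 0.004, 0.2]
adjusted_p_values = bonferroni_correct(raw_p_values, n_tests=250)

for raw, adjusted in zip(raw_p_values, adjusted_p_values):
    print(f"raw = {raw:.2e}, adjusted = {adjusted:.2e}")
```

A category is then reported as over-represented when its adjusted p value falls below the chosen significance threshold.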