Cornelia Di Gaetano, Giuseppe Matullo, Alberto Piazza, Moreno Ursino, Mauro Gasparini.
Abstract
Knowledge of markers in the human genome that show spatial patterns and extreme correlation with environmental determinants plays an important role in understanding the factors that affect the biological evolution of our species. We used genotype data for more than half a million single nucleotide polymorphisms (SNPs) from the Human Genome Diversity Panel (HGDP-CEPH) and calculated Spearman's correlation between absolute latitude and one of the two allele frequencies of each SNP. We selected SNPs with a correlation coefficient within the upper 1% tail of the distribution. We then used a criterion of proximity between significant variants to focus on DNA regions showing a continuous signal over a portion of the genome. Based on external information and genome annotations, we demonstrated that most regions with the strongest signals also have biological relevance. We believe this proximity requirement gives our novel method an edge over the existing literature, highlighting several genes (for example DTNB, DOT1L, TPCN2, RELN, MSRA, NRG3) related to body size or shape, human height, hair color, and schizophrenia. Our approach can be applied generally to any measure of association between polymorphic frequencies and continuously varying environmental variables.
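The three steps described in the abstract (per-SNP Spearman correlation with absolute latitude, selection of the upper 1% tail, and a proximity requirement) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the toy data, and the reading of "proximity" as a run of at least `min_run` consecutive significant SNPs in genomic order are all assumptions made for illustration.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def upper_tail_flags(abs_rho, tail=0.01):
    """Flag SNPs whose |rho| is in the upper `tail` fraction (ties at the cutoff are kept)."""
    n_keep = max(1, int(len(abs_rho) * tail))
    cutoff = sorted(abs_rho, reverse=True)[n_keep - 1]
    return [r >= cutoff for r in abs_rho]

def consecutive_runs(flags, min_run=3):
    """Return (start, end) index pairs for runs of at least `min_run` flagged SNPs."""
    runs, start = [], None
    for i, flagged in enumerate(list(flags) + [False]):  # sentinel closes a trailing run
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            if i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    return runs

# Hypothetical toy data: 5 populations, 5 SNPs listed in genomic order.
latitudes = [5.0, -25.0, 30.0, 45.0, 60.0]
abs_lat = [abs(v) for v in latitudes]
freqs = [
    [0.5, 0.4, 0.6, 0.5, 0.4],
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [0.9, 0.8, 0.7, 0.6, 0.5],
    [0.2, 0.3, 0.4, 0.6, 0.7],
    [0.3, 0.5, 0.2, 0.4, 0.3],
]
rho = [spearman(abs_lat, f) for f in freqs]
# The toy set is tiny, so a wide tail stands in for the 1% used in the study.
flags = upper_tail_flags([abs(r) for r in rho], tail=0.5)
regions = consecutive_runs(flags, min_run=3)
```

Taking the absolute value of rho before thresholding means strongly negative and strongly positive correlations are treated alike. On real data the tail would be set to 0.01, and `min_run` would be varied to check how stable the candidate regions are.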
Keywords: adaptations; latitude; outlier approach; point processes; spatial patterns
Year: 2013 PMID: 23423242 PMCID: PMC3565544 DOI: 10.4137/EBO.S10211
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Figure 1. Graphical workflow process for the study.
Figure 2. (Panel A) Histogram of the values of Spearman’s correlation coefficient over all the SNPs and theoretical approximate density of the Spearman’s correlation coefficient under the hypothesis of population null correlation. (Panel B) Histogram of the absolute values of Spearman’s correlation coefficient over all the SNPs. Using the outlier approach, we identify significant SNPs in the 1% upper tail of this distribution.
Figure 3. Counting process representation of the location of the candidate regions of chromosome 1.
Notes: The thicker step function represents cumulative counts of all originally genotyped SNPs and refers to the main ordinate scale, on the left. The thinner step function represents cumulative counts of significant SNPs and refers to the ordinate scale on the right. Sixteen regions identified by our method are shown as small vertical segments on the abscissa. The zoom box in the upper left part of the graph shows two of them (gray bands) located around position 202500 kb, as indicated by the arrows.
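The counting process in Figure 3 can be read as a function N(t) that returns the number of SNPs located at or before position t. The sketch below shows one way to evaluate such step functions. The positions and the helper name `cumulative_count` are hypothetical and are not taken from the study.

```python
from bisect import bisect_right

def cumulative_count(sorted_positions, t):
    """N(t): number of SNPs whose position is <= t (positions sorted ascending)."""
    return bisect_right(sorted_positions, t)

# Hypothetical SNP positions in kb along one chromosome.
all_snps = [100, 250, 400, 900, 1500, 1510, 1520, 3000]
significant = [1500, 1510, 1520]

# Evaluate both step functions on a common grid of positions.
grid = [0, 1000, 1505, 2000, 4000]
all_curve = [cumulative_count(all_snps, t) for t in grid]
sig_curve = [cumulative_count(significant, t) for t in grid]
```

A steep rise in the curve for significant SNPs over a short stretch of the chromosome corresponds to a cluster of nearby significant variants, which is the kind of signal the proximity criterion targets.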
List of several genes reported in previously published GWAS and showing continuous correlation signals with our proximity-based method.
| Reported gene(s) | Trait | Region | NCBI ID | Gene description | Reference |
|---|---|---|---|---|---|
| | Erectile dysfunction and prostate cancer treatment | 9q22.32 | 84909 | Chromosome 9 open reading frame 3 | |
| | Response to amphetamine | 9q34.12 | 25 | v-abl Abelson murine leukemia viral oncogene homolog 1 | |
| | Adult human height | 2p23.3 | 1838 | Dystrobrevin, beta | |
| | Coronary heart disease | 2p23.3 | 1838 | Dystrobrevin, beta | |
| | Hair pigmentation in Europeans | 11q13.3 | 219931 | Two pore segment channel 2 | |
| | Associated with height | 19p13.3 | 84444 | DOT1-like, histone H3 methyltransferase ( | |
| | Susceptibility and clinical phenotype in multiple sclerosis | 7q22.1 | 5649 | Reelin | |
| | Increases the risk of schizophrenia only in women | 7q22.1 | 5649 | Reelin | |
| | Celiac disease | 4q27 | 59067 | Interleukin 21 | |
| | Protein quantitative trait loci | 5q35.1 | 1794 | Dedicator of cytokinesis 2 | |
| | Celiac disease | 3p14.1 | 23150 | FERM domain containing 4B | |
| | Hippocampal atrophy | 7q21.11 | 9863 | Membrane associated guanylate kinase, WW and PDZ domain containing 2 | |
| | Cognitive performance | 8q22.3 | 83988 | Neurocalcin delta | |
| | Response to iloperidone treatment (QT prolongation) | 10q23.1 | 10718 | Neuregulin 3 | |
| | Celiac disease | 1p36.11 | 864 | Runt-related transcription factor 3 | |
| | Quantitative traits | 7p22.2 | 221935 | Sidekick homolog 1 (chicken) | |
| | Adiposity | 8p23.1 | 4482 | Methionine sulfoxide reductase A | |
| | Hypertension | 8p23.1 | 4482 | Methionine sulfoxide reductase A | |
| | Schizophrenia | 8p23.1 | 4482 | Methionine sulfoxide reductase A | |
| | Bipolar disorder and schizophrenia | 8p23.1 | 4482 | Methionine sulfoxide reductase A | |
Percentage of common SNPs when varying the minimum number of consecutive SNPs required.
| % concordance | 3 SNPs | 4 SNPs | 5 SNPs |
|---|---|---|---|
| 3 SNPs | 100.00% | 74.70% | 64.00% |
| 4 SNPs | | 100.00% | 80.80% |
| 5 SNPs | | | 100.00% |