Barbara E Stranger, Matthew S Forrest, Andrew G Clark, Mark J Minichiello, Samuel Deutsch, Robert Lyle, Sarah Hunt, Brenda Kahl, Stylianos E Antonarakis, Simon Tavaré, Panagiotis Deloukas, Emmanouil T Dermitzakis.
Abstract
The exploration of quantitative variation in human populations has become one of the major priorities for medical genetics. The successful identification of variants that contribute to complex traits is highly dependent on reliable assays and genetic maps. We have performed a genome-wide quantitative trait analysis of 630 genes in 60 unrelated Utah residents with ancestry from Northern and Western Europe using the publicly available phase I data of the International HapMap project. The genes are located in regions of the human genome with elevated functional annotation and disease interest including the ENCODE regions spanning 1% of the genome, Chromosome 21 and Chromosome 20q12-13.2. We apply three different methods of multiple test correction, including Bonferroni, false discovery rate, and permutations. For the 374 expressed genes, we find many regions with statistically significant association of single nucleotide polymorphisms (SNPs) with expression variation in lymphoblastoid cell lines after correcting for multiple tests. Based on our analyses, the signal proximal (cis-) to the genes of interest is more abundant and more stable than distal and trans across statistical methodologies. Our results suggest that regulatory polymorphism is widespread in the human genome and show that the 5-kb (phase I) HapMap has sufficient density to enable linkage disequilibrium mapping in humans. Such studies will significantly enhance our ability to annotate the non-coding part of the genome and interpret functional variation. In addition, we demonstrate that the HapMap cell lines themselves may serve as a useful resource for quantitative measurements at the cellular level.
Year: 2005 PMID: 16362079 PMCID: PMC1315281 DOI: 10.1371/journal.pgen.0010078
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. QQ Plot of cis versus trans HSA20 −log10 p-Values
The figure contrasts −log10 p-values from associations between SNPs and genes within the 10-Mb region of HSA20 with −log10 p-values from associations between genes in the 10-Mb region of HSA20 and SNPs in one of ten ENCODE regions. Note that the distribution falls off the diagonal around −log10 p = 4, which we consider the borderline for the high enrichment of cis significant effects. A similar pattern is observed with any set of trans −log10 p-values on HSA20 or any other cis vs. trans contrast in any region we tested.
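The cis versus trans contrast described in this caption can be sketched as a quantile-quantile comparison on the −log10 p scale. The following is a minimal illustration using only the Python standard library, not the authors' plotting code; the function names (`neg_log10`, `quantile`, `qq_pairs`) and the linear-interpolation quantile rule are assumptions made for this example.

```python
import math


def neg_log10(pvalues):
    """Convert p-values to -log10 p, sorted in ascending order."""
    return sorted(-math.log10(p) for p in pvalues)


def quantile(sorted_values, fraction):
    """Linear-interpolation quantile of an ascending list (fraction in [0, 1])."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = fraction * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def qq_pairs(cis_pvalues, trans_pvalues, n_points=100):
    """Return (trans_quantile, cis_quantile) pairs on the -log10 p scale.

    Points on the diagonal mean both distributions agree at that quantile;
    points above it mean the cis set holds stronger associations there.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    cis = neg_log10(cis_pvalues)
    trans = neg_log10(trans_pvalues)
    fractions = [i / (n_points - 1) for i in range(n_points)]
    return [(quantile(trans, f), quantile(cis, f)) for f in fractions]


# Hypothetical p-values, invented purely to exercise the functions.
example_cis = [0.5, 0.1, 0.01, 1e-5, 1e-7]
example_trans = [0.6, 0.2, 0.05, 0.01, 0.001]
example_pairs = qq_pairs(example_cis, example_trans, n_points=5)
```

Plotting the second element of each pair against the first gives the kind of curve the caption describes, where departure from the diagonal marks enrichment of strong cis signals.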
Figure 2. Cis- Signals of SNP-Gene Associations in the Human Genome
(A) The relationship between statistical significance and distance from gene. Each data point represents the maximum −log10 p for a single gene and SNPs located cis- to its coding locus. The −log10 p-values from the additive model are plotted as a function of distance between the center of the genomic span of the gene and cis- located SNPs (cis- < 4 Mb). Only those gene-SNP associations that have −log10 p > 4 are shown. SNPs are from the 5-kb HapMap. This plot includes data for 101 genes (129 probes). (B) Cis- SNPs with −log10 p ≥ 4 from the 688 probes analyzed are plotted against their chromosomal location on NCBI34 coordinates of the human genome.
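For readers unfamiliar with the additive model named in panel (A), it regresses a gene's expression level on the number of copies of one allele (0, 1, or 2) carried by each individual. The sketch below illustrates that regression with the standard library only. The p-value here comes from a simple per-test permutation of expression values, which is a stand-in chosen to avoid external dependencies and is not the authors' procedure; the function names and the toy data are assumptions.

```python
import random


def additive_slope(genotypes, expression):
    """Least-squares slope of expression on allele count (0, 1, or 2)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("genotype is monomorphic in this sample")
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    return sxy / sxx


def permutation_pvalue(genotypes, expression, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for the additive slope.

    Expression values are shuffled relative to genotypes; the p-value is the
    share of shuffles whose absolute slope reaches the observed one, with the
    usual +1 correction so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = abs(additive_slope(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(additive_slope(genotypes, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Toy data (invented): expression rises by one unit per allele copy.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
toy_slope = additive_slope(toy_genotypes, toy_expression)
toy_pvalue = permutation_pvalue(toy_genotypes, toy_expression, n_permutations=2000)
```

In a scan like the one the figure summarizes, this test would be repeated for every SNP near a gene and the largest −log10 p kept per gene.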
Comparison of Multiple-Test Correction Methods
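As a rough illustration of how two of the correction methods differ in stringency, the sketch below computes a Bonferroni p-value threshold and a false discovery rate threshold for a list of p-values. The Benjamini-Hochberg step-up procedure is used here as a simple stand-in for false discovery rate control; the paper's figures refer to q-values, which may be computed differently. The function names and example p-values are assumptions.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """P-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg_threshold(pvalues, fdr=0.05):
    """Largest p-value rejected by the Benjamini-Hochberg procedure.

    Returns None when no test passes. Every p-value at or below the returned
    cutoff is declared significant at the chosen false discovery rate.
    """
    m = len(pvalues)
    threshold = None
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank * fdr / m:
            threshold = p
    return threshold


def as_neg_log10(p):
    """Express a p-value cutoff on the -log10 p scale used in the figures."""
    return -math.log10(p)


# Invented p-values, for illustration only.
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042,
                   0.06, 0.074, 0.205, 0.212, 0.216]
bonf_cutoff = bonferroni_threshold(len(example_pvalues))
bh_cutoff = benjamini_hochberg_threshold(example_pvalues)
```

With these invented values the Bonferroni cutoff admits fewer tests than the false discovery rate cutoff, which is the general pattern: Bonferroni is the more conservative of the two.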
Genes with Significant cis and trans Associations
Figure 3. Examples of cis- Associations from the Genome-Wide and High-Density SNP Maps
(A) Genomic location of associated SNPs close to the SERPINB10 gene. Custom tracks in the UCSC genome browser (http://genome.ucsc.edu) show the location of the Illumina probe and proximal SNPs in the context of genome annotation. The lower horizontal black line indicates the −log10 p threshold where the corresponding q-value is 0.05 (i.e., any SNPs with −log10 p values that meet or exceed this threshold are significant at the q = 0.05 level), and the upper line is the Bonferroni genome-wide threshold. Additional tracks describe known genes, first-exon and promoter predictions, conserved transcription factor binding sites, Gencode genes, RNA polymerase 2, and Transcription factor 2 binding sites, identified by Affymetrix ChIP/chip experiments, and Sp1 and Sp3 binding sites identified by Stanford's ChIP/chip experiments. Consensus conserved elements are shown in the final track. The HapMap LD information below is for the CEU individuals and suggests that there are two conserved haplotype clusters in this region.