Duc-Quang Nguyen, Caleb Webber, Chris P Ponting.
Abstract
Although large-scale copy-number variation is an important contributor to conspecific genomic diversity, whether these variants frequently contribute to human phenotype differences remains unknown. If they have few functional consequences, then copy-number variants (CNVs) might be expected both to be distributed uniformly throughout the human genome and to encode genes that are characteristic of the genome as a whole. We find that human CNVs are significantly overrepresented close to telomeres and centromeres and in simple tandem repeat sequences. Additionally, human CNVs were observed to be unusually enriched in those protein-coding genes that have experienced significantly elevated synonymous and nonsynonymous nucleotide substitution rates, estimated between single human and mouse orthologues. CNV genes encode disproportionately large numbers of secreted, olfactory, and immunity proteins, although they contain fewer than expected genes associated with Mendelian disease. Despite mouse CNVs also exhibiting a significant elevation in synonymous substitution rates, in most other respects they do not differ significantly from the genomic background. Nevertheless, they encode proteins that are depleted in olfactory function, and they exhibit significantly decreased amino acid sequence divergence. Natural selection appears to have acted discriminately among human CNV genes. The significant overabundance, within human CNVs, of genes associated with olfaction, immunity, protein secretion, and elevated coding sequence divergence, indicates that a subset may have been retained in the human population due to the adaptive benefit of increased gene dosage. By contrast, the functional characteristics of mouse CNVs either suggest that advantageous gene copies have been depleted during recent selective breeding of laboratory mouse strains or suggest that they were preferentially fixed as a consequence of the larger effective population size of wild mice. 
It thus appears that CNV differences among mouse strains do not provide an appropriate model for large-scale sequence variations in the human population.
Year: 2006 PMID: 16482228 PMCID: PMC1366494 DOI: 10.1371/journal.pgen.0020020
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Significance Estimates of CNV Gene Properties
Figure 1. Relative Frequency Histograms of Distances from Human CNVs to the Nearest Centromere or Telomere
Relative frequency histograms (striped blue bars) are compared to their expected distributions if CNVs were distributed randomly within the genome (grey bars); these expected distributions are fitted to Gaussian distributions (grey lines). Red lines represent 99.9999% prediction confidence intervals from the fitted curves.
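The comparison described in this caption (observed CNV distances versus those expected under random placement) can be illustrated with a small Monte Carlo sketch. This is not the authors' procedure, which fits Gaussian curves to the expected distributions. It is a simplified stand-in that assumes a single hypothetical chromosome, treats each CNV as a point, and uses the mean distance as the summary statistic. The function names, the chromosome length, and the centromere coordinate are all illustrative assumptions.

```python
import random


def nearest_landmark_distance(pos, chrom_length, centromere):
    """Distance from a position to the nearest telomere (either chromosome end)
    or to the centromere. All coordinates share the same units (e.g. bp)."""
    return min(pos, chrom_length - pos, abs(pos - centromere))


def empirical_p_closer(observed_positions, chrom_length, centromere,
                       n_sim=2000, seed=0):
    """One-sided Monte Carlo p-value for the hypothesis that the observed
    positions lie closer to telomeres/centromere than uniformly placed ones.

    Returns (observed mean distance, empirical p-value).
    NOTE: a simplified illustration, not the method used in the source.
    """
    rng = random.Random(seed)
    n = len(observed_positions)
    obs_mean = sum(
        nearest_landmark_distance(p, chrom_length, centromere)
        for p in observed_positions
    ) / n

    hits = 0
    for _ in range(n_sim):
        # Null model: the same number of positions, placed uniformly at random.
        sim_mean = sum(
            nearest_landmark_distance(rng.uniform(0, chrom_length),
                                      chrom_length, centromere)
            for _ in range(n)
        ) / n
        if sim_mean <= obs_mean:
            hits += 1

    # Add-one correction so the estimate is never exactly zero.
    return obs_mean, (hits + 1) / (n_sim + 1)
```

A small p-value from `empirical_p_closer` would indicate that the supplied positions sit nearer to telomeres or the centromere than uniform placement would typically produce. A genome-wide version would need real chromosome lengths, centromere coordinates, and CNV intervals rather than points.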
Statistically Significant (p < 10⁻³) Over- or Under-Representation of Gene Ontology (GO) Categories in Human CNVs
Figure 2. Relative Frequencies of the Ratio of KA to KS for Human–Mouse 1:1 Orthologous Genes
(A) KA/KS ratios for all human–mouse orthologue pairs (median KA/KS 0.094).
(B) KA/KS ratios for orthologue pairs of human genes that are completely encompassed in human CNVs (median KA/KS 0.112).
(C) KA/KS ratios for orthologue pairs of mouse genes completely encompassed in mouse CNVs (median KA/KS 0.081). A Kolmogorov-Smirnov test between (A) and (B) demonstrates that KA/KS values are significantly higher, on average, for human genes completely encompassed in human CNVs than for all human–mouse orthologue pairs (p = 1.7 × 10⁻²). On the other hand, genes completely encompassed in mouse CNVs exhibit significantly lower KA/KS values than all human–mouse orthologue pairs (p = 3.3 × 10⁻³).
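The Kolmogorov-Smirnov comparison mentioned in this caption tests whether two samples of ratios are drawn from the same distribution. The following is a minimal sketch of a two-sample test using only the standard library. It assumes the standard asymptotic approximation for the p-value (with the usual small-sample correction to the effective sample size). The source does not state which implementation or variant the authors used, and the function name `ks_two_sample` is illustrative.

```python
import math


def ks_two_sample(x, y):
    """Two-sample Kolmogorov-Smirnov test.

    Returns (D, p), where D is the maximum absolute difference between the
    two empirical CDFs and p is an asymptotic two-sided p-value.
    NOTE: an illustrative sketch; the p-value is an approximation.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    # Walk both sorted samples, evaluating the ECDF gap at each distinct value.
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))

    # Effective sample size with the commonly used small-sample correction.
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        # The Kolmogorov tail probability is indistinguishable from 1 here,
        # and the alternating series converges slowly for tiny lam.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)
```

In practice, `x` would hold the ratios for genes inside CNVs and `y` the ratios for all orthologue pairs. A two-sided test of this kind reports only that the distributions differ, so the direction of the shift has to be read from the medians, as the caption does.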
Significance Estimates of Properties of “Frequent” Human CNVs Observed in Multiple Studies or “Rare” Human CNVs Observed in Single Studies
Significance Estimates of Properties of Human CNVs Duplicated or Deleted with Respect to the Human Genome Reference Sequence
Statistically Significant (p < 10⁻³) Over- or Under-Representation of Gene Ontology (GO) Categories in Mouse CNVs