Xi Long1,2, Hong Xue3,4,5. 1. Division of Life Science and Applied Genomics Centre, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China. 2. HKUST Shenzhen Research Institute, 9 Yuexing First Road, Nanshan, Shenzhen, China. 3. Division of Life Science and Applied Genomics Centre, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China. hxue@ust.hk. 4. HKUST Shenzhen Research Institute, 9 Yuexing First Road, Nanshan, Shenzhen, China. hxue@ust.hk. 5. Centre for Cancer Genomics, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, Jiangsu, China. hxue@ust.hk.
Abstract
BACKGROUND: Genetic variants, which underlie phenotypic diversity, are known to be unevenly distributed in the human genome. A comprehensive understanding of the distributions of different genetic variants is important for insights into genetic functions and disorders. METHODS: A sliding-window scan of the regional densities of eight kinds of germline genetic variants, including single-nucleotide polymorphisms (SNPs) and four size classes of copy-number variations (CNVs), was performed on the human genome. RESULTS: The study identified 44,379 hotspots with high genetic-variant densities and 1135 hotspot clusters comprising more than one type of hotspot, accounting for 3.1% and 0.2% of the genome, respectively. The hotspots and clusters co-localize with different functional genomic features, as exemplified by the association of hotspots of middle-size CNVs with histone-modification sites; they work with balancing and positive selection to meet the need for diversity in immune proteins, and facilitate the development of sensory-perception and neuroactive ligand-receptor interaction pathways in function-sparse, late-replicating genomic sequences. Genetic variants of different lengths co-localize with retrotransposons of different ages on a "long-with-young" and "short-with-all" basis. Hotspots and clusters are highly associated with tumor suppressor genes and oncogenes (p < 10^-10), and are enriched with somatic tumor CNVs and with the trait- and disease-associated SNPs identified by genome-wide association studies, with enrichment exceeding tenfold in clusters comprising SNPs and extra-long CNVs. CONCLUSIONS: The genetic-variant hotspots and clusters represent double-edged swords that spearhead both positive and negative genomic changes. Their strong associations with complex traits and diseases also open up a potential "Common Disease-Hotspot Variant" approach to the missing heritability problem.
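The sliding-window density scan named in METHODS can be illustrated with a minimal Python sketch. The abstract does not state the window size, step, or density cutoff used in the study, so the `window`, `step`, and `min_count` parameters and both function names below are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from bisect import bisect_left


def window_densities(positions, chrom_length, window=1000, step=500):
    """Count variants falling in each half-open window [start, end).

    `window` and `step` are illustrative defaults, not values from the study.
    """
    positions = sorted(positions)
    windows = []
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)
        # Number of sorted positions p with start <= p < end.
        count = bisect_left(positions, end) - bisect_left(positions, start)
        windows.append((start, end, count))
        start += step
    return windows


def call_hotspots(windows, min_count):
    """Keep windows whose variant count reaches a hypothetical cutoff."""
    return [(s, e, c) for (s, e, c) in windows if c >= min_count]


if __name__ == "__main__":
    # Toy coordinates on a 200-bp "chromosome"; not real data.
    toy = window_densities([10, 20, 30, 40, 900, 1500], 200, window=100, step=50)
    print(call_hotspots(toy, min_count=3))
```

In this toy setup, a hotspot is simply a window whose count meets the cutoff. A cluster in the sense of the abstract would then correspond to a region where hotspots of more than one variant type overlap, which could be found by running the same scan once per variant type and intersecting the resulting intervals.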