selscan: an efficient multithreaded program to perform EHH-based scans for positive selection.

Abstract

Haplotype-based scans to detect natural selection are useful to identify recent or ongoing positive selection in genomes. As both real and simulated genomic data sets grow larger, spanning thousands of samples and millions of markers, there is a need for a fast and efficient implementation of these scans for general use. Here, we present selscan, an efficient multithreaded application that implements Extended Haplotype Homozygosity (EHH), Integrated Haplotype Score (iHS), and Cross-population EHH (XPEHH). selscan accepts phased genotypes in multiple formats, including TPED, performs extremely well on both simulated and real data, and runs over an order of magnitude faster than existing implementations. It calculates iHS on chromosome 22 (22,147 loci) across 204 CEU haplotypes in 353 s on one thread (33 s on 16 threads) and calculates XPEHH for the same data relative to 210 YRI haplotypes in 578 s on one thread (52 s on 16 threads). Source code and binaries (Windows, OSX, and Linux) are available at https://github.com/szpiech/selscan.

Year: 2014 PMID: 25015648 PMCID: PMC4166924 DOI: 10.1093/molbev/msu211

Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240

Introduction

Extended Haplotype Homozygosity (EHH) (Sabeti et al. 2002), Integrated Haplotype Score (iHS) (Voight et al. 2006), and Cross-population Extended Haplotype Homozygosity (XPEHH) (Sabeti et al. 2007) are statistics designed to use phased genotypes to identify putative regions of recent or ongoing positive selection in genomes. They are all based on the model of a hard selective sweep, where a de novo adaptive mutation arises on a haplotype that quickly sweeps toward fixation, reducing diversity around the locus. If selection is strong enough, this occurs faster than recombination or mutation can act to break up the haplotype, and thus a signal of high haplotype homozygosity can be observed extending from an adaptive locus. As genetic data sets grow larger both in number of individuals and number of loci, there is a need for a fast and efficient publicly available implementation of these statistics. Below, we introduce these statistics and provide concise definitions for their calculations. We then evaluate the performance of our implementation, selscan.

Extended Haplotype Homozygosity

In a sample of $n$ chromosomes, let $\mathcal{C}$ denote the set of all possible distinct haplotypes at a locus of interest (named $x_0$), and let $\mathcal{C}(x_i)$ denote the set of all possible distinct haplotypes extending from the locus $x_0$ to the $i$-th marker either upstream or downstream from $x_0$. For example, if the locus of interest $x_0$ is a biallelic single nucleotide polymorphism (SNP) where 0 represents the ancestral allele and 1 represents the derived allele, then $\mathcal{C} := \{0, 1\}$. If $x_1$ is an immediately adjacent marker, then the set of all possible haplotypes is $\mathcal{C}(x_1) := \{11, 10, 00, 01\}$. EHH of the entire sample, extending from the locus $x_0$ out to marker $x_i$, is calculated as

$$\mathrm{EHH}(x_i) = \sum_{h \in \mathcal{C}(x_i)} \frac{\binom{n_h}{2}}{\binom{n}{2}} \tag{1}$$

where $n_h$ is the number of observed haplotypes of type $h \in \mathcal{C}(x_i)$.

In some cases, we may want to calculate the haplotype homozygosity of a subsample of chromosomes all carrying a “core” haplotype at locus $x_0$. Let $\mathcal{H}_c(x_i)$ be a partition of $\mathcal{C}(x_i)$ containing all distinct haplotypes carrying the core haplotype, $c \in \mathcal{C}$, at $x_0$ and extending to marker $x_i$. Note that

$$\mathcal{C}(x_i) = \bigcup_{c \in \mathcal{C}} \mathcal{H}_c(x_i) \tag{2}$$

Following the example above, if the derived allele (1) is chosen as the core haplotype, then $\mathcal{H}_1(x_1) := \{11, 10\}$. Similarly, if the ancestral allele is the core haplotype, then $\mathcal{H}_0(x_1) := \{00, 01\}$. We calculate the EHH of the chromosomes carrying the core haplotype $c$ to marker $x_i$ as

$$\mathrm{EHH}_c(x_i) = \sum_{h \in \mathcal{H}_c(x_i)} \frac{\binom{n_h}{2}}{\binom{n_c}{2}} \tag{3}$$

where $n_h$ is the number of observed haplotypes of type $h \in \mathcal{H}_c(x_i)$ and $n_c$ is the number of observed haplotypes carrying the core haplotype ($c \in \mathcal{C}$).
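As a small illustration of the two EHH definitions in this section, the following Python sketch counts distinct haplotypes between the core site and a chosen marker and sums the number of identical pairs, divided by the number of pairs in the sample (or among carriers of the core allele). The function names `ehh` and `ehh_core` and the string encoding of haplotypes are assumptions made for this example; they are not taken from selscan's source code.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, marker):
    """Whole-sample EHH from the core site out to `marker` (both are column indices)."""
    lo, hi = sorted((core, marker))
    # Count how often each distinct haplotype occurs on the span [lo, hi].
    counts = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    n = len(haplotypes)
    return sum(comb(n_h, 2) for n_h in counts.values()) / comb(n, 2)


def ehh_core(haplotypes, core, marker, allele):
    """EHH restricted to chromosomes carrying `allele` at the core site."""
    carriers = [h for h in haplotypes if h[core] == allele]
    if len(carriers) < 2:
        return 0.0  # assumption: no pairs to compare, so report zero homozygosity
    return ehh(carriers, core, marker)


# Toy sample of six phased haplotypes over two markers (core site is column 0).
haps = ["11", "11", "10", "00", "01", "01"]
whole_sample = ehh(haps, core=0, marker=1)           # 2/15
derived_core = ehh_core(haps, 0, 1, allele="1")      # 1/3
```

In the toy sample, the derived-allele carriers are 11, 11, and 10, so one of the three possible pairs is identical and the core EHH at the adjacent marker is 1/3.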

Integrated Haplotype Score

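iHS is calculated by using equation (3) to track the decay of haplotype homozygosity for both the ancestral and derived haplotypes extending from a query site. To calculate iHS at a site, we first calculate the integrated haplotype homozygosity (iHH) for the ancestral (0) and derived (1) haplotypes ($c \in \mathcal{C} = \{0, 1\}$) via trapezoidal quadrature,

$$\mathrm{iHH}_c = \sum_{i=1}^{|\mathcal{D}|} \frac{1}{2}\left(\mathrm{EHH}_c(x_{i-1}) + \mathrm{EHH}_c(x_i)\right) g(x_{i-1}, x_i) + \sum_{i=1}^{|\mathcal{U}|} \frac{1}{2}\left(\mathrm{EHH}_c(x_{i-1}) + \mathrm{EHH}_c(x_i)\right) g(x_{i-1}, x_i) \tag{4}$$

where $\mathcal{D}$ is the set of markers downstream from the current locus such that $x_i \in \mathcal{D}$ denotes the $i$-th closest downstream marker from the locus of interest ($x_0$). $\mathcal{U}$ and $x_i \in \mathcal{U}$ are defined similarly for upstream markers. $g(x_{i-1}, x_i)$ gives the genetic distance between two markers. The (unstandardized) iHS is then calculated as

$$\ln\left(\frac{\mathrm{iHH}_1}{\mathrm{iHH}_0}\right) \tag{5}$$

Note that this definition differs slightly from that in Voight et al. (2006), where unstandardized iHS is defined with $\mathrm{iHH}_1$ and $\mathrm{iHH}_0$ swapped. Finally, the unstandardized scores are normalized in frequency bins across the entire genome,

$$\mathrm{iHS} = \frac{\ln\left(\frac{\mathrm{iHH}_1}{\mathrm{iHH}_0}\right) - E_p\left[\ln\left(\frac{\mathrm{iHH}_1}{\mathrm{iHH}_0}\right)\right]}{SD_p\left[\ln\left(\frac{\mathrm{iHH}_1}{\mathrm{iHH}_0}\right)\right]} \tag{6}$$

where $E_p[\cdot]$ and $SD_p[\cdot]$ are the expectation and standard deviation in frequency bin $p$.

In practice, the summations in equation (4) are truncated once $\mathrm{EHH}_c(x_i) < 0.05$. Additionally, with low-density SNP data, if the physical distance $b$ (in kbp) between two markers is >20, then $g(x_{i-1}, x_i)$ is scaled by a factor of $20/b$ in order to reduce possible spurious signals induced by lengthy gaps. During computation, if the start/end of a chromosome arm is reached before $\mathrm{EHH}_c(x_i) < 0.05$ or if a gap of $b > 200$ is encountered, the iHS calculation is aborted for that locus. iHS is not reported at core sites with minor allele frequency (MAF) < 0.05. In selscan, the EHH truncation value, gap scaling factor, and core site MAF cutoff value are all flexible parameters definable on the command line.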
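The integration and normalization steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not selscan's implementation: the function names, the list-based inputs, the decision to include the trapezoid that crosses the truncation threshold, the equal-width frequency bins, and the use of the population standard deviation are all choices made for this example. The default values (0.05 truncation, 20 kbp gap scale, 200 kbp maximum gap) follow the text.

```python
import math
from statistics import mean, pstdev


def ihh_one_direction(ehh_values, gen_pos, phys_pos_kb,
                      cutoff=0.05, gap_scale=20.0, max_gap=200.0):
    """Trapezoidal integral of EHH moving away from the core site (index 0).

    Returns None when the calculation would be aborted: a gap larger than
    `max_gap` kbp, or the data ending before EHH drops below `cutoff`.
    """
    total = 0.0
    for i in range(1, len(ehh_values)):
        b = abs(phys_pos_kb[i] - phys_pos_kb[i - 1])  # physical gap in kbp
        if b > max_gap:
            return None
        g = abs(gen_pos[i] - gen_pos[i - 1])          # genetic distance
        if b > gap_scale:
            g *= gap_scale / b                        # down-weight long gaps
        total += 0.5 * (ehh_values[i - 1] + ehh_values[i]) * g
        if ehh_values[i] < cutoff:                    # truncate the sum here
            return total
    return None


def unstandardized_ihs(ihh_derived, ihh_ancestral):
    """ln(iHH_1 / iHH_0); each iHH is the upstream plus downstream integral."""
    return math.log(ihh_derived / ihh_ancestral)


def normalize_in_bins(scores, freqs, n_bins=50):
    """Standardize scores within equal-width derived-allele-frequency bins."""
    bin_of = [min(int(f * n_bins), n_bins - 1) for f in freqs]
    out = []
    for s, b in zip(scores, bin_of):
        members = [x for x, bb in zip(scores, bin_of) if bb == b]
        sd = pstdev(members)
        out.append((s - mean(members)) / sd if sd > 0 else 0.0)
    return out
```

For example, `ihh_one_direction([1.0, 0.5, 0.04], [0.0, 0.1, 0.3], [0, 10, 20])` integrates two trapezoids (0.075 and 0.054) and stops at the third marker, where EHH first falls below 0.05.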

Cross-population Extended Haplotype Homozygosity

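To calculate XPEHH between populations A and B at a marker $x_0$, we first calculate iHH for each population separately, integrating the EHH of the entire sample in the population (eq. 1). If $\mathrm{iHH}_A$ and $\mathrm{iHH}_B$ are the iHHs for populations A and B, then the (unstandardized) XPEHH is

$$\ln\left(\frac{\mathrm{iHH}_A}{\mathrm{iHH}_B}\right) \tag{7}$$

and after genome-wide normalization we have

$$\mathrm{XPEHH} = \frac{\ln\left(\frac{\mathrm{iHH}_A}{\mathrm{iHH}_B}\right) - E\left[\ln\left(\frac{\mathrm{iHH}_A}{\mathrm{iHH}_B}\right)\right]}{SD\left[\ln\left(\frac{\mathrm{iHH}_A}{\mathrm{iHH}_B}\right)\right]} \tag{8}$$

In practice, the sums in each of $\mathrm{iHH}_A$ and $\mathrm{iHH}_B$ (eq. 7) are truncated at $x_i$, the marker at which the EHH of the haplotypes pooled across populations is $\mathrm{EHH}(x_i) < 0.05$. Scaling of $g(x_{i-1}, x_i)$ and handling of gaps is done as for iHS, and these parameters are definable on the selscan command line.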
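The distinctive step here is that both populations are integrated out to the same marker, chosen from the pooled sample. The Python sketch below illustrates that idea under simplifying assumptions: EHH values per marker are supplied as precomputed lists, gap scaling is omitted, and the names `truncated_ihh` and `unstandardized_xpehh` are hypothetical rather than taken from selscan.

```python
import math


def truncated_ihh(ehh, ehh_pooled, gen_pos, cutoff=0.05):
    """Integrate one population's EHH in one direction from the core (index 0),
    stopping at the first marker where the pooled-sample EHH is below `cutoff`.

    Returns None if the pooled EHH never drops below the cutoff.
    """
    stop = next((i for i, e in enumerate(ehh_pooled) if e < cutoff), None)
    if stop is None:
        return None
    return sum(
        0.5 * (ehh[i - 1] + ehh[i]) * abs(gen_pos[i] - gen_pos[i - 1])
        for i in range(1, stop + 1)
    )


def unstandardized_xpehh(ihh_a, ihh_b):
    """ln(iHH_A / iHH_B), where each iHH sums the upstream and downstream integrals."""
    return math.log(ihh_a / ihh_b)


# Toy values: the pooled EHH falls below 0.05 at the third marker (index 2).
gen_pos = [0.0, 0.1, 0.2, 0.3]
pooled = [1.0, 0.6, 0.04, 0.01]
ihh_a = truncated_ihh([1.0, 0.8, 0.4, 0.2], pooled, gen_pos)   # about 0.15
ihh_b = truncated_ihh([1.0, 0.4, 0.2, 0.1], pooled, gen_pos)   # about 0.10
# Assume, for the toy case only, identical upstream and downstream integrals.
score = unstandardized_xpehh(2 * ihh_a, 2 * ihh_b)
```

A positive score in this convention indicates longer haplotype homozygosity in population A than in population B.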

Performance

Here, we evaluate the performance of selscan (https://github.com/szpiech/selscan, last accessed July 16, 2014) for computing the iHS and XPEHH statistics. In addition, we compare performance on these statistics with the programs rehh (Gautier and Vitalis 2012, http://cran.r-project.org/package=rehh, last accessed July 16, 2014), ihs (Voight et al. 2006), and xpehh (Pickrell et al. 2009). Both ihs and xpehh are available for download at http://hgdp.uchicago.edu/Software/ (last accessed July 16, 2014). All computations were run on a Mac Pro running OS X 10.8.5 with two 2.4 GHz 6-core Intel Xeon processors with hyperthreading enabled. For runtime evaluation of iHS calculations, we simulated a 4 Mbp region of DNA with the program ms (Hudson 2002) and generated four independent data sets with varying numbers of sampled haplotypes ( and ). We sampled 250 haplotypes (9,625 SNP loci), 500 haplotypes (10,646 SNP loci), 1,000 haplotypes (11,655 SNP loci), and 2,000 haplotypes (12,724 SNP loci). We name these data sets IHS250, IHS500, IHS1000, and IHS2000, respectively. These data sets represent a densely typed region similar to next-generation sequencing data. Although these data sets are generated via strictly neutral processes, they serve the purpose of runtime evaluation perfectly well. We also use data from The 1000 Genomes Project (1000 Genomes Project Consortium 2012) Omni genotypes, calculating iHS scores at 22,147 SNP loci on chromosome 22 across 102 CEU individuals (204 haplotypes). We name this data set CEU22. Table 1 summarizes the runtimes of ihs, rehh, and selscan. We note that rehh integrates haplotype homozygosity over a physical map, whereas ihs and selscan integrate over a genetic map by default. This does not affect runtimes (data not shown), which are measured using genetic maps for ihs and selscan. Even operating on a single thread, selscan calculates iHS scores at least an order of magnitude faster than ihs and up to 1.8× faster than rehh for large data sets.

Table 1.

Runtime Performance (in seconds) of ihs, rehh, and selscan for Calculating Unstandardized iHS for Various Data Sets.

Data Set	ihs	rehh^a	selscan, 1 thread	selscan, 2 threads	selscan, 4 threads	selscan, 8 threads	selscan, 16 threads
IHS250	19,275	563	618	306	162	84	58
IHS500	45,547	1,652	1,554	782	399	220	150
IHS1000	>100,000	4,834	4,018	2,019	1,040	566	380
IHS2000	>100,000	12,652	7,054	3,633	1,869	1,046	752
CEU22	19,434	588	353	182	93	50	33

Note.—Calculations running over 100,000 s were aborted.

^a rehh integrates over a physical map instead of a genetic map. Using a physical map does not affect selscan’s runtime (data not shown).
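The speedup claims can be checked directly from the Table 1 runtimes. The short Python sketch below copies the table values and computes single-thread speedup over the other programs and parallel efficiency (speedup divided by thread count); the dictionary layout and function names are assumptions made for this example.

```python
# Runtimes in seconds copied from Table 1; None marks runs aborted after 100,000 s.
TABLE1 = {
    "IHS250":  {"ihs": 19275, "rehh": 563,   "selscan": {1: 618,  2: 306,  4: 162,  8: 84,   16: 58}},
    "IHS500":  {"ihs": 45547, "rehh": 1652,  "selscan": {1: 1554, 2: 782,  4: 399,  8: 220,  16: 150}},
    "IHS1000": {"ihs": None,  "rehh": 4834,  "selscan": {1: 4018, 2: 2019, 4: 1040, 8: 566,  16: 380}},
    "IHS2000": {"ihs": None,  "rehh": 12652, "selscan": {1: 7054, 2: 3633, 4: 1869, 8: 1046, 16: 752}},
    "CEU22":   {"ihs": 19434, "rehh": 588,   "selscan": {1: 353,  2: 182,  4: 93,   8: 50,   16: 33}},
}


def speedup_over(tool, data_set):
    """Runtime of `tool` divided by single-thread selscan runtime (None if aborted)."""
    other = TABLE1[data_set][tool]
    if other is None:
        return None
    return other / TABLE1[data_set]["selscan"][1]


def parallel_efficiency(data_set, threads):
    """Speedup of selscan on `threads` threads relative to one thread, per thread."""
    runs = TABLE1[data_set]["selscan"]
    return runs[1] / runs[threads] / threads


for name in TABLE1:
    print(name, speedup_over("ihs", name), speedup_over("rehh", name),
          round(parallel_efficiency(name, 16), 2))
```

On CEU22 this gives roughly a 55× speedup over ihs, about 1.7× over rehh, and a parallel efficiency near 0.67 on 16 threads. The test machine has 12 physical cores, so the 16-thread runs rely partly on hyperthreading.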

We compare unstandardized iHS scores for the CEU22 data set using ihs and selscan and find excellent agreement (fig. 1A, Pearson’s r = 0.9946). The slight variance in scores between the two programs is likely due to an undocumented difference in the way ihs calculates its scores (supplementary material of Sabeti et al.), but the effect is negligible. We also calculate unstandardized iHS scores for the CEU22 data set using rehh and selscan (using a physical map) and again find excellent agreement (Pearson’s r = 0.9953).

Fig. 1. (A) Unstandardized iHS scores calculated on the CEU22 data set for selscan and ihs (Pearson’s r = 0.9946) and (B) unstandardized XPEHH scores calculated on the CEUYRI22 data set for selscan and xpehh (Pearson’s r = 0.9999).

Cross-Population Extended Haplotype Homozygosity

For runtime evaluation of XPEHH calculations, we simulated a 4 Mbp region of DNA with the program ms (Hudson 2002) with a simple two-population divergence model (time to divergence t = 0.05, , and ) and generated four independent data sets with varying numbers of sampled haplotypes. We sampled 250 haplotypes (125 from each population, 12,920 SNP loci), 500 haplotypes (250 from each population, 14,989 SNP loci), 1,000 haplotypes (500 from each population, 17,142 SNP loci), and 2,000 haplotypes (1,000 from each population, 19,567 SNP loci). We name these data sets XP250, XP500, XP1000, and XP2000, respectively. These data sets represent a densely typed region similar to next-generation sequencing data. Although these data sets are generated via strictly neutral processes, they serve the purpose of runtime evaluation perfectly well. We also use data from The 1000 Genomes Project (1000 Genomes Project Consortium 2012) Omni genotypes, calculating XPEHH scores at 22,147 SNP loci on chromosome 22 across 102 CEU individuals (204 haplotypes) and 105 YRI individuals (210 haplotypes). We name this data set CEUYRI22. Table 2 summarizes the runtimes of xpehh and selscan. Even operating on a single thread, selscan tends to calculate XPEHH scores at least an order of magnitude faster than xpehh. Figure 1B shows the correlation (Pearson’s r = 0.9999) of CEUYRI22 unstandardized XPEHH scores between the two programs.

Table 2.

Runtime Performance (in seconds) of xpehh and selscan for Calculating Unstandardized XPEHH for Various Data Sets.

Data Set	xpehh	selscan, 1 thread	selscan, 2 threads	selscan, 4 threads	selscan, 8 threads	selscan, 16 threads
XP250	11,113	287	141	71	38	25
XP500	57,006	766	403	194	104	67
XP1000	>100,000	2,037	1,018	515	274	180
XP2000	>100,000	5,683	2,798	1,471	763	493
CEUYRI22	37,271	578	291	150	78	52

Note.—Calculations running over 100,000 s were aborted.
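One way to read Table 2 is to compare how runtime grows between data sets with how simple candidate scalings in the number of haplotypes (N) and the number of loci (D) would grow. The Python sketch below does this for XP250 and XP500, the only simulated pair for which xpehh finished. Treating the SNP count in the fixed 4 Mbp region as a stand-in for locus density, and the particular candidate exponents, are assumptions of this example; two data points can suggest a scaling but cannot establish it.

```python
# (haplotypes N, SNP loci D) for the simulated XPEHH data sets, from the text.
DATA_SETS = {"XP250": (250, 12920), "XP500": (500, 14989),
             "XP1000": (1000, 17142), "XP2000": (2000, 19567)}

# Single-thread runtimes in seconds from Table 2 (aborted xpehh runs omitted).
SELSCAN = {"XP250": 287, "XP500": 766, "XP1000": 2037, "XP2000": 5683}
XPEHH = {"XP250": 11113, "XP500": 57006}


def observed_ratio(runtimes, small, large):
    """How many times longer the larger data set took."""
    return runtimes[large] / runtimes[small]


def model_ratio(small, large, n_power, d_power):
    """Growth predicted by a runtime proportional to N**n_power * D**d_power."""
    (n_s, d_s), (n_l, d_l) = DATA_SETS[small], DATA_SETS[large]
    return (n_l / n_s) ** n_power * (d_l / d_s) ** d_power


print("selscan observed:", round(observed_ratio(SELSCAN, "XP250", "XP500"), 2))
print("xpehh observed:  ", round(observed_ratio(XPEHH, "XP250", "XP500"), 2))
print("N * D^2 model:   ", round(model_ratio("XP250", "XP500", 1, 2), 2))
print("N^2 * D^2 model: ", round(model_ratio("XP250", "XP500", 2, 2), 2))
```

For this pair, selscan's runtime grows by about 2.7× and xpehh's by about 5.1×, close to the roughly 2.7× and 5.4× growth of the two candidate models respectively.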


Conclusions

selscan achieves a speedup of at least an order of magnitude over both ihs and xpehh and a speedup of nearly 2× over rehh for large data sets through general optimizations of the calculations. We also implement shared memory parallelism with multithreading to further speed up calculations on computers with multiple cores. Because iHS and XPEHH attempt to calculate a score for each site in the data and each score can be calculated independently of the others, selscan partitions the workload (sites at which to calculate a score) across threads, while maintaining each thread’s access to the entire data set required to make the calculation. Additional empirical testing (data not shown) suggests that rehh, ihs, and selscan (for both iHS and XPEHH calculations) are $O(ND^2)$, and xpehh is $O(N^2D^2)$, where N is the number of haploid samples and D is the SNP locus density. Each of these statistics requires phased haplotypes and a genetic or physical map as input data (TPED format), and missing genotypes must either be dropped or imputed. Because of the speed improvements we have implemented, we expect that selscan will be a valuable tool for calculating EHH-based genome-wide scans for positive selection in very large genetic data sets, including whole-genome sequencing and genome-wide association study data, currently being generated for humans and other organisms. selscan will also allow for in-depth examination of the performance of these statistics under a wide range of parameters in large-scale simulation studies.
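The workload partitioning described above can be sketched in Python as follows. This is an illustration of the general pattern only: the contiguous near-equal split, the `score_site` callback, and the function names are assumptions for this example and are not taken from selscan's source. Each worker writes to disjoint slots of a shared result list while reading the same shared haplotype data.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(n_sites, n_threads):
    """Split site indices 0..n_sites-1 into n_threads contiguous, near-equal ranges."""
    base, extra = divmod(n_sites, n_threads)
    ranges, start = [], 0
    for t in range(n_threads):
        size = base + (1 if t < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def scan(haplotypes, score_site, n_threads=4):
    """Compute score_site(haplotypes, s) for every site s, one site range per thread."""
    n_sites = len(haplotypes[0])
    results = [None] * n_sites

    def work(site_range):
        for s in site_range:
            # Every thread reads the full shared data set but writes only its own sites.
            results[s] = score_site(haplotypes, s)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(work, partition(n_sites, n_threads)))  # wait and surface errors
    return results


# Toy per-site statistic standing in for iHS or XPEHH: the derived-allele count.
haps = ["101", "110", "100"]
counts = scan(haps, lambda data, s: sum(h[s] == "1" for h in data), n_threads=2)
```

In CPython the global interpreter lock prevents pure-Python threads from running computations in parallel, so this sketch shows the partitioning scheme rather than a real speedup; a compiled multithreaded program does not have that limitation.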

References

1. Generating samples under a Wright-Fisher neutral model of genetic variation.

Authors: Richard R Hudson
Journal: Bioinformatics Date: 2002-02 Impact factor: 6.937

2. Detecting recent positive selection in the human genome from haplotype structure.

Authors: Pardis C Sabeti; David E Reich; John M Higgins; Haninah Z P Levine; Daniel J Richter; Stephen F Schaffner; Stacey B Gabriel; Jill V Platko; Nick J Patterson; Gavin J McDonald; Hans C Ackerman; Sarah J Campbell; David Altshuler; Richard Cooper; Dominic Kwiatkowski; Ryk Ward; Eric S Lander
Journal: Nature Date: 2002-10-09 Impact factor: 49.962

3. rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure.

Authors: Mathieu Gautier; Renaud Vitalis
Journal: Bioinformatics Date: 2012-03-07 Impact factor: 6.937

4. Genome-wide detection and characterization of positive selection in human populations.

Authors: Pardis C Sabeti; Patrick Varilly; Ben Fry; Jason Lohmueller; Elizabeth Hostetter; Chris Cotsapas; Xiaohui Xie; Elizabeth H Byrne; Steven A McCarroll; Rachelle Gaudet; Stephen F Schaffner; Eric S Lander; The International HapMap Consortium
Journal: Nature Date: 2007-10-18 Impact factor: 49.962

5. Signals of recent positive selection in a worldwide sample of human populations.

Authors: Joseph K Pickrell; Graham Coop; John Novembre; Sridhar Kudaravalli; Jun Z Li; Devin Absher; Balaji S Srinivasan; Gregory S Barsh; Richard M Myers; Marcus W Feldman; Jonathan K Pritchard
Journal: Genome Res Date: 2009-03-23 Impact factor: 9.043

6. An integrated map of genetic variation from 1,092 human genomes.

Authors: Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean
Journal: Nature Date: 2012-11-01 Impact factor: 49.962

7. A map of recent positive selection in the human genome.

Authors: Benjamin F Voight; Sridhar Kudaravalli; Xiaoquan Wen; Jonathan K Pritchard
Journal: PLoS Biol Date: 2006-03-07 Impact factor: 8.029
