Ravi Valluru1, Elodie E Gazave2, Samuel B Fernandes3, John N Ferguson3,4, Roberto Lozano2, Pradeep Hirannaiah3,4, Tao Zuo5, Patrick J Brown6, Andrew D B Leakey3,4, Michael A Gore2, Edward S Buckler5,2,7, Nonoy Bandillo1. 1. Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853 rv285@cornell.edu nb549@cornell.edu. 2. Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, New York 14853. 3. Department of Plant Biology, Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Illinois. 4. Department of Crop Sciences, Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Illinois 61801. 5. Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853. 6. Section of Agricultural Plant Biology, Department of Plant Sciences, University of California, Davis, California 95616. 7. United States Department of Agriculture, Agricultural Research Service, R. W. Holley Center, Ithaca, New York 14853.
Abstract
Sorghum (Sorghum bicolor L.) is a major food cereal for millions of people worldwide. The sorghum genome, like other species, accumulates deleterious mutations, likely impacting its fitness. The lack of recombination, drift, and the coupling with favorable loci impede the removal of deleterious mutations from the genome by selection. To study how deleterious variants impact phenotypes, we identified putative deleterious mutations among ∼5.5 M segregating variants of 229 diverse biomass sorghum lines. We provide the whole-genome estimate of the deleterious burden in sorghum, showing that ∼33% of nonsynonymous substitutions are putatively deleterious. The pattern of mutation burden varies appreciably among racial groups. Across racial groups, the mutation burden correlated negatively with biomass, plant height, specific leaf area (SLA), and tissue starch content (TSC), suggesting that deleterious burden decreases trait fitness. Putatively deleterious variants explain roughly one-half of the genetic variance. However, there is only moderate improvement in total heritable variance explained for biomass (7.6%) and plant height (average of 3.1% across all stages). There is no advantage in total heritable variance for SLA and TSC. The contribution of putatively deleterious variants to phenotypic diversity therefore appears to be dependent on the genetic architecture of traits. Overall, these results suggest that incorporating putatively deleterious variants into genomic models slightly improves prediction accuracy because of extensive linkage. Knowledge of deleterious variants could be leveraged for sorghum breeding through either genome editing and/or conventional breeding that focuses on the selection of progeny with fewer deleterious alleles.
Plant genomes continually accumulate new mutations due to population demographic history (Brandvain ), random drift (Lynch and Gabriel 1990), the mating system (Hartfield and Glémin 2014), domestication (Lu ; Ramu ), and linked selection due to genetic interactions (Felsenstein 1974). While a sizeable portion of such new mutations are neutral (Shaw ; Covert ), a small portion of new mutations are likely to be deleterious because they disrupt evolutionarily conserved sites, protein function (Yampolsky ; Doniger ), or gene expression (Kremling ) in a way that results in negative impacts on fitness. The elimination of deleterious mutations from breeding populations has therefore been suggested as a prospective avenue for crop improvement (Morrell ; Moyers ).
Sorghum (Sorghum bicolor L., 2n = 20) is an important and versatile crop that is grown for food, forage, and fuel. It was domesticated from its wild ancestor ∼8000 years ago in Africa (Wendorf ). Five major morphological forms have traditionally been recognized: bicolor, caudatum, durra, guinea, and kafir. While these races are widespread in distinct regions of Africa, reflecting the diverse agro-ecological environments (Dillon ; Evans ), sorghum has maintained minimal genome redundancy due to the absence of any whole-genome duplication for > 70 MY (Paterson , 2009). However, inbreeding sorghum is likely to accumulate more weakly deleterious mutations when compared to an outcrossing species, which accumulates strong recessive deleterious mutations that reduce the mean fitness of the species over time (Moyers ). Nonetheless, there is accumulating evidence showing that enhanced homozygosity (Kumaravadivel and Rangasamy 1994), relaxed selection (Arunkumar ), and low levels of outcrossing (Pamilo ; Nakayama ) can act to purge deleterious mutations, leading to lower mutation burden in selfing populations. 
Though the relative contributions of these processes to mutation burden have long been debated, both theoretical and experimental evidence suggests that reduced population size effects usually outcompete processes that enhance the purging of deleterious mutations caused by selfing (Bustamante ; Slotte , 2013; Arunkumar ), leading to an influx of deleterious mutations into selfing species.
Modern breeding and domestication result in an increased mutation burden in domesticates when compared to their wild progenitors, and a decreased mutation burden in elite cultivars when compared to landraces (Gaut ; Ramu ; Yang ). The demographic history and inbreeding allow deleterious variants of weaker effect to reach appreciable frequencies owing to random drift, which can contribute significantly to mutation burden and affect fitness-related traits (Kono ). An estimated 20–30% of nonsynonymous variants are deleterious in rice (Lu ), Arabidopsis (Günther and Schmid 2010), maize (Mezmouk and Ross-Ibarra 2014), and cassava (Ramu ). Renaut and Rieseberg (2015) identified an excess of nonsynonymous single-nucleotide polymorphisms (SNPs) segregating in domesticated sunflower and globe artichoke relative to natural populations. Similarly, ∼20–40% of protein-coding SNPs are predicted to have a deleterious allele in maize (Mezmouk and Ross-Ibarra 2014). Indeed, deleterious mutations are predicted to be enriched near regions of strong selection (Chun and Fay 2011; Gaut ; Kono ), pointing to a potentially important role for deleterious variants in shaping agronomic phenotypes.
Genomic selection (GS) can help to accelerate crop breeding when compared to conventional phenotype-based selection approaches. In genome-wide prediction (GWP) models employed in GS, the genetic variance is modeled by accounting for either the biological additive or dominant effects of the markers that can potentially improve the prediction accuracy of phenotypic traits (Vitezica , 2016). 
Genes associated with complex traits carry an uncertain number of deleterious mutations distributed across the genome, and such a mutation burden contributes significantly to the total phenotypic variation of traits (Yang ). Because deleterious mutations can occur in both homozygous and heterozygous states depending on the genetic context, trait-specific and genetic-context based GWP models could be expected to capture the phenotypic effects of deleterious mutations. Therefore, GWP models encompassing deleterious variants are expected to account for the total genetic contribution to and improve the prediction accuracy of complex traits (Yang ). However, the improvement of GWP will depend on how strongly correlated deleterious variants are to all other variants.
In this study, we examine the contribution of putatively deleterious variants to phenotypic variation in sorghum. We used a racially, geographically, and phenotypically diverse biomass sorghum population that represents the ancestry of four major sorghum types (Brenton ). All accessions were phenotyped for two agronomic traits, dry biomass (DBM) and plant height (PH), and for two physiological traits, specific leaf area (SLA) and tissue starch content (TSC), under field conditions. We performed whole-genome resequencing (WGS) on 229 sorghum lines and identified putative deleterious mutations in the genome. The main objectives of this study were to determine (1) whether empirical patterns of deleterious mutation burden differ among sorghum racial groups, and (2) whether deleterious variants improve prediction accuracy of complex traits and, if so, whether such accuracy differs among phenotypic traits that have different genetic architecture. To address these questions, we first identified the putative deleterious mutations and their biological effect sizes, and then estimated an individual mutation burden and its relationship with phenotypic traits. 
Taking advantage of a Bayesian GS framework (Habier ), we tested the biological significance of deleterious variants on prediction of DBM, PH, SLA, and TSC.
Materials and Methods
Plant material, field experiments, and phenotypic data
A biomass sorghum diversity panel assembled for the Transportation Energy Resource from Renewable Agriculture-Mobile Energy-Crop Phenotyping Platform (TERRAMEPP) and Transportation Energy Resource from Renewable Agriculture-Water Efficient Sorghum Technologies (TERRA-WEST) projects was used in this study. This panel was composed of 869 lines: 339 lines coming from Fernandes , 117 lines coming from Brenton , 273 lines coming from Yu , and 140 additional lines obtained from John Burke (United States Department of Agriculture, Lubbock, TX). Although phenotypic data for the entire panel were collected, only a subset of 229 lines for which WGS data were available was used in the study. These 229 lines belong to four major races of sorghum (caudatum, durra, guinea, and kafir) with representatives from the African continent, Asia, and the Americas (Supplemental Material, Figure S1).
Field experiments were conducted in Illinois during 2016 in an augmented block design that consisted of 960 four-row plots with a row length of 3 m, 1.5 m alleys, and 0.76 m row spacing. All plots were arranged in 40 rows and 24 columns. Target density of the plant population was ∼270,368 plants ha−1, and experiments were planted in late May and harvested in early October. PH was measured from the ground to the uppermost leaf whorl at seven developmental stages, every 2 weeks from 4 to 16 weeks after planting (WAP), and averaged across the plot. Biomass data were collected at harvest using a four-row Kemper head attached to a John Deere 5830 tractor. A plot sampler equipped with a near-infrared sensor (model 130S, RCI Engineering) was used to measure the wet weight of total biomass (lb), and to quantify biomass moisture (%) and starch (%) contents of plants (Li ) in the two middle rows of each four-row plot. Biomass yield in dry U.S. tons per acre was calculated as:
dry U.S. tons per acre = [total plot wet weight (lb) × (1 − plot moisture) × 0.0005] / plot area (acre)
where plot moisture enters as a fraction and 0.0005 converts pounds to U.S. tons (1 ton = 2000 lb). Because some accessions had flowered (38 accessions), flowering data were recorded in 2018 (flowering data were not available for 2016). We conducted an additional set of analyses that excluded these 38 accessions to assess the potential confounding effect of flowering time on PH.
To estimate SLA, the youngest fully expanded leaves from two randomly selected plants of the middle two rows of each plot were excised just above the ligule 60–70 days after planting. Damaged leaves were avoided. Excised leaves were then recut under water and the cut surface kept immersed. In the laboratory, three 1.6-cm leaf discs were collected from the middle of each leaf while avoiding the midrib. Leaf discs were immediately transferred to an oven set at 60° for 2 weeks. The dry mass of leaf discs was determined and SLA was expressed as the ratio of fresh leaf area to dry leaf mass (cm2 g−1). Because SLA sampling spanned a 10-day interval, we used “date of sampling” as a term in the model to generate best linear unbiased predictors (BLUPs).
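The two phenotype calculations described above can be illustrated with a short Python sketch. It is only a worked example under stated assumptions: the function names are hypothetical, moisture is taken as a fraction, and the 1.6 cm disc size is assumed to be a diameter (the text does not say). It is not the authors' processing code.

```python
import math

LB_PER_US_TON = 2000.0  # 1 U.S. ton = 2000 lb, i.e. the 0.0005 ton-per-lb factor


def dry_tons_per_acre(wet_weight_lb, moisture_fraction, plot_area_acre):
    """Dry biomass yield (U.S. tons per acre) from a plot's wet weight."""
    if not 0.0 <= moisture_fraction < 1.0:
        raise ValueError("moisture must be a fraction in [0, 1)")
    dry_weight_lb = wet_weight_lb * (1.0 - moisture_fraction)
    return dry_weight_lb / plot_area_acre / LB_PER_US_TON


def specific_leaf_area(disc_diameter_cm, n_discs, total_dry_mass_g):
    """SLA in cm^2 per g: total fresh disc area divided by total dry mass.

    Assumption: discs are circular and disc_diameter_cm is the diameter.
    """
    area_cm2 = n_discs * math.pi * (disc_diameter_cm / 2.0) ** 2
    return area_cm2 / total_dry_mass_g
```

For instance, a hypothetical plot of 0.001 acre with 100 lb of wet biomass at 70% moisture gives about 15 dry U.S. tons per acre.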
Statistical analysis of phenotypic data
Phenotypic data analysis was conducted according to the experimental design, which consisted of a series of incomplete blocks connected through common checks. The following model was used to generate BLUPs for all genotypes included in the field trial:
$$y_{ijk} = \mu + g_i + e_j + b_{k(j)} + ge_{ij} + \varepsilon_{ijk}$$
where $\mu$ is the overall mean, $g_i$ is the random effect of the ith genotype, $e_j$ is the random effect of the jth environment, $b_{k(j)}$ is the random effect of the kth incomplete block nested within the jth environment, $ge_{ij}$ represents the effect of genotype-by-environment interaction, and $\varepsilon_{ijk}$ is the residual error for the ith genotype in the kth incomplete block in the jth environment.
For SLA, we fitted another model that accounted for the sampling date:
$$y_{ijkl} = \mu + g_i + e_j + b_{k(j)} + d_{l(kj)} + ge_{ij} + \varepsilon_{ijkl}$$
where $\mu$ is the overall mean, $g_i$ is the random effect of the ith genotype, $e_j$ is the random effect of the jth environment, $b_{k(j)}$ is the random effect of the kth incomplete block nested within the jth environment, $d_{l(kj)}$ is the random effect of the lth sampling date nested within the kth incomplete block and the jth environment, $ge_{ij}$ represents the effect of genotype-by-environment interaction, and $\varepsilon_{ijkl}$ is the residual error for the ith genotype in the kth incomplete block and lth sampling date in the jth environment.
For the purpose of estimating the broad-sense heritability (H) of each phenotype, we estimated variance components using restricted maximum likelihood. All effects were assumed to be random. Broad-sense heritability on an entry-mean basis was calculated as
$$H = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{G \times E}/n_e + \sigma^2_e/(n_e\, r)}$$
where $\sigma^2_G$ is the variance among accessions, $\sigma^2_{G \times E}$ is the accession-by-environment variance, $\sigma^2_e$ is the error variance, $n_e$ is the number of environments (locations), and $r$ is the number of replicates. All analyses were conducted in R software (R Development Core Team 2015).
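As a worked illustration of the entry-mean heritability formula, the following Python sketch computes H from variance components that are assumed to have been estimated already (for example by REML). The function and parameter names are hypothetical, with `n_env` the number of environments and `n_rep` the number of replicates, and the numbers in the usage note are invented.

```python
def entry_mean_heritability(var_g, var_gxe, var_e, n_env, n_rep):
    """Broad-sense heritability on an entry-mean basis.

    var_g   : variance among accessions
    var_gxe : accession-by-environment variance
    var_e   : error variance
    """
    if n_env < 1 or n_rep < 1:
        raise ValueError("n_env and n_rep must be at least 1")
    # Phenotypic variance of an entry mean: GxE and error terms shrink
    # with the number of environments and replicates.
    var_entry_mean = var_g + var_gxe / n_env + var_e / (n_env * n_rep)
    return var_g / var_entry_mean
```

With invented components of 4, 2, and 6 for the accession, interaction, and error variances, two environments, and three replicates, the denominator is 4 + 1 + 1 = 6 and H is 2/3.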
Genotyping
Genomic DNA (gDNA) was extracted using the cetyl trimethylammonium bromide (CTAB) method and quantified using PicoGreen (Molecular Probes, Eugene, OR) on a Synergy HT microplate reader (BioTek, Winooski, VT). After preprocessing steps of the gDNA samples, 10 libraries were prepared (24 samples in each library) and sequenced on HiSeq 4000 (PE_2x150) using sequencing kit version 1. Fastq files were demultiplexed with the bcl2fastq v2.17.1.14 conversion software of Illumina. We used Sentieon Genomics Pipeline DNA sequencing (Freed ) and a series of custom bash scripts to process the raw reads. Briefly, fastq files were aligned to the Sorghum bicolor reference genome version 3.1 (https://phytozome.jgi.doe.gov). PCR duplicates were removed, base quality was recalibrated based on a “known SNPs” file, and recalibrated files were processed through the Haplotype Caller (HC). No realignment around insertions/deletions was performed. The data set therefore contained 239 samples, corresponding to 229 unique accessions, of which seven had one or two replicates.
To create a list of known SNPs for the recalibration step, the HC pipeline was run without recalibration on the list of 239 BAM files. The output was filtered to remove SNPs with heterozygous genotypes in > 10% of accessions and/or with a number of heterozygous genotypes more than two times the number of minor alleles [hereafter referred to as “homozygosity-based filter” (Chia )]. In addition, “SNP clusters,” defined as three or more SNPs located within 5 bp, were also filtered out. Clusters of SNPs are often generated by misalignment and were conservatively considered as spurious. The filtered list of SNPs was used as known SNPs to recalibrate the BAM files and to generate a final list of SNPs. 
The vcf file generated by the HC contained biallelic SNPs (n = 22,359,733) and was further filtered to only retain SNPs with at least 4× coverage (n = 21,865,512), and with a nonmissing genotype in ≤ 40% of the samples (n = 14,535,156). After removing SNP clusters and applying homozygosity-based filters, the final data set contained 5,512,653 SNPs that were used for further analyses.
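Two of the filters described above, the 10% heterozygosity threshold and the SNP-cluster rule, can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' pipeline: the function names and the genotype encoding (0, 1, 2 alternate-allele counts, `None` for missing) are assumptions, heterozygosity is computed over called genotypes only, "within 5 bp" is read as a span of at most 5 bp between the first and last SNP of a run, and the second homozygosity criterion (heterozygotes versus minor alleles) is not implemented.

```python
def passes_heterozygosity_filter(genotypes, max_het_fraction=0.10):
    """Keep a SNP only if at most max_het_fraction of called genotypes are heterozygous.

    genotypes: alternate-allele counts (0, 1, 2), or None when missing.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False  # nothing called, so the SNP cannot be evaluated
    n_het = sum(1 for g in called if g == 1)
    return n_het / len(called) <= max_het_fraction


def flag_snp_clusters(positions, min_snps=3, max_span_bp=5):
    """Flag SNPs belonging to a run of >= min_snps SNPs spanning <= max_span_bp.

    positions: SNP coordinates on one chromosome, sorted in ascending order.
    Returns a list of booleans aligned with positions (True = in a cluster).
    """
    flagged = [False] * len(positions)
    for start in range(len(positions) - min_snps + 1):
        end = start + min_snps - 1
        if positions[end] - positions[start] <= max_span_bp:
            for k in range(start, end + 1):
                flagged[k] = True
    return flagged
```

Because every window of `min_snps` consecutive SNPs is checked, longer clusters are flagged in full through their overlapping windows.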
Identifying putatively deleterious mutations
The effect of amino acid substitutions on protein function was predicted with the Sorting Intolerant From Tolerant (SIFT) algorithm (Vaser ). A nonsynonymous mutation with a SIFT score < 0.05 was defined as a putative deleterious mutation. To identify a higher-confidence set of deleterious mutations, we used genomic evolutionary rate profiling (GERP > 2) (Davydov ) estimated from a multi-species whole-genome alignment of six species including Zea mays, Oryza sativa, Setaria italica, Brachypodium distachyon, Hordeum vulgare, and Musa acuminata. We therefore used both an estimate of sequence conservation (GERP > 2) and protein conservation (SIFT < 0.05) to identify more conservative deleterious mutations (hereafter HGERPDEL-SNPs, where H denotes high-confidence) in constrained portions of the genome. Using these HGERPDEL-SNPs, we estimated the mutation burden, which was defined as the number of derived deleterious alleles carried by an individual divided by the total number of nonmissing alleles (Vitezica ), based on a putative derived deleterious allele that was defined as a minor allele in the multi-species alignment (Yang ). First, we counted the total number of deleterious alleles in a given genotype. Here, each allele was given a score of 0.5. If both were deleterious alleles at a given position, we counted them as 1 (0.5 for each allele). If only one allele was deleterious, then it was counted as 0.5. We summed all these homozygous (1’s) and heterozygous (0.5’s) deleterious alleles. Second, we counted the total number of alleles used to score deleterious alleles in a given genotype. Finally, the total number of deleterious alleles was divided by the total number of scored alleles, and the resulting ratio was defined as the mutation burden.
To account for the effects of linkage, we calculated linkage disequilibrium (LD) between SNPs and identified random variants (nondeleterious) to be used as a control set to compare with deleterious variants. 
A subset of 100,000 random SNP markers was selected and all possible pairwise r2 values were calculated using Plink 1.9 (Chang ). Using 1% of all the possible pairwise calculations, we calculated the relationship between marker distance and r2. To define local LD structure across each chromosome, we also calculated the mean LD score (Bulik-Sullivan ) for each marker. LD scores were calculated with a window of 1 Mb using the software GCTA (Yang ; Bulik-Sullivan ). Each LD score was divided by the total number of SNPs within each window (Figure S2). To identify SNPs in high LD with deleterious variants, we first explored the effect of window size and r2 threshold on the number of SNPs selected (Figure S3). Given the LD pattern observed, we used a window size of 250 kb and an r2 threshold of 0.9, meaning that any marker within 250 kb of a deleterious variant and with an r2 ≥ 0.9 to it was excluded from further analysis. This yielded a list of ∼1 million SNPs that were in LD with deleterious SNPs, which were excluded from all SNPs. From the remaining SNPs, 100 sets of random variants, each equal in size to the deleterious set and with a similar allele frequency range, were selected (Figure S4).
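The mutation-burden scoring described earlier in this section can be written out as a minimal Python sketch. It follows the text literally, which is an interpretation rather than the authors' code: each derived deleterious allele scores 0.5, and the denominator is the number of scored alleles, taken here as two per nonmissing site for a diploid. The function name and the per-site encoding (0, 1, or 2 deleterious alleles, `None` for missing) are hypothetical.

```python
def mutation_burden(deleterious_allele_counts):
    """Mutation burden of one individual across deleterious sites.

    deleterious_allele_counts: per-site number of derived deleterious
    alleles (0, 1, or 2), or None when the genotype is missing.
    """
    scored = [c for c in deleterious_allele_counts if c is not None]
    if not scored:
        return float("nan")  # no scored sites for this individual
    # 0.5 per deleterious allele: homozygous sites add 1, heterozygous 0.5.
    burden_score = sum(0.5 * c for c in scored)
    # Assumption: two scored alleles per nonmissing diploid site.
    n_scored_alleles = 2 * len(scored)
    return burden_score / n_scored_alleles
```

For an individual with one homozygous deleterious site, one heterozygous site, one site with no deleterious allele, and one missing site, the score is 1.5 over 6 scored alleles, giving a burden of 0.25.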
Estimating effect sizes of deleterious and nondeleterious variants
Despite the different assumptions in genetic architecture made by the different models, and the fact that the QTL effects are not of equal size and have different genetic architectures, the simplest model, ridge regression (RR)-BLUP, often performs just as well in extensive cross-validation and empirical studies. Unless indicated otherwise, effect sizes were estimated using the RR-BLUP model implemented in the R-package rrBLUP version 4.2 (Endelman 2011). We fitted the model y = μ + Zu + e, where y is a vector of BLUPs of phenotype; μ is an intercept vector; Z is an n × p incidence matrix (either deleterious or random variants) containing the allelic states of the p marker loci (z = {−1, 0, 1}), where −1 represents the minor allele; u is the p × 1 vector of marker effects; and e is an n × 1 vector of residuals. Under RR-BLUP, u ∼ MVN (0, Iσ2u), where σ2u is the variance of the common distribution of marker effects and was estimated using restricted maximum likelihood.
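To make the RR-BLUP model above concrete, here is a small pure-Python sketch of ridge-regression marker effects. It is not the rrBLUP implementation: rrBLUP estimates the variance components by REML, whereas this sketch assumes the shrinkage parameter `lam` (the ratio of residual to marker-effect variance) is already known. The function names are hypothetical, and the dense solver is only practical for a handful of markers.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_marker_effects(z, y, lam):
    """Return (mu, u) for y = mu + Z u + e with ridge penalty lam on u only."""
    n, p = len(z), len(z[0])
    y_mean = sum(y) / n
    z_means = [sum(row[j] for row in z) / n for j in range(p)]
    # Centering Z and y leaves the intercept unpenalized.
    zc = [[row[j] - z_means[j] for j in range(p)] for row in z]
    yc = [v - y_mean for v in y]
    lhs = [[sum(r[j] * r[k] for r in zc) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    rhs = [sum(r[j] * v for r, v in zip(zc, yc)) for j in range(p)]
    u = solve_linear(lhs, rhs)
    mu = y_mean - sum(zm * uj for zm, uj in zip(z_means, u))
    return mu, u
```

With a single marker coded 1, −1, 1, −1, phenotypes 2, 0, 2, 0, and `lam = 4`, the least-squares effect of 1.0 is shrunk to 0.5, which is the behavior the common-variance prior on u is meant to produce.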
Partitioning of genetic variance and GWP
We compared the variance explained by deleterious variants to that of an equal proportion of randomly sampled variants from the distribution of nondeleterious variants. Following the method of Brenton , we used a two-dimensional sampling approach to create 100 equal-sized data sets of randomly sampled variants matched for minor allele frequency. For each trait, we fitted the model separately for each variant set (either deleterious variant or nondeleterious variant) and estimated the phenotypic variance explained.
For each variant set (deleterious variant vs. nondeleterious set), we fitted a standard genomic (G)BLUP model including only additive effects by fitting a linear mixed model of the following form: y = Zg + e, where y is a vector of BLUPs for the phenotype; g is a vector of random effects whose BLUPs represent the genomic estimated breeding values (GEBV) of the individuals; Z is a design matrix indicating observations of genotype identities; and e is a vector of residuals. The GEBV were obtained by assuming g ∼ MVN (0, Kσ2g), where σ2g is the additive genetic variance and K is the square genomic relationship matrix based on SNP data, implemented in TASSEL (Bradbury ). Predictive abilities for all traits were evaluated using a fivefold cross-validation approach repeated 100 times and were implemented in the R statistical software.
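The repeated fivefold cross-validation scheme can be sketched as follows. This Python outline is an assumption-laden illustration, not the authors' R code: it does not fit GBLUP itself but takes a hypothetical `fit_predict(train_idx, test_idx)` callable that would wrap the model, and it defines predictive ability as the Pearson correlation between pooled out-of-fold predictions and observed values, which the text does not specify.

```python
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[f::k] for f in range(k)]


def repeated_cv(y, fit_predict, k=5, repeats=100, seed=1):
    """Mean predictive ability over `repeats` rounds of k-fold CV.

    fit_predict(train_idx, test_idx) must return one prediction per test
    index, using only the training individuals to fit the model.
    """
    rng = random.Random(seed)
    abilities = []
    for _ in range(repeats):
        predictions = [None] * len(y)
        for test_idx in k_fold_indices(len(y), k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            for i, p in zip(test_idx, fit_predict(train_idx, test_idx)):
                predictions[i] = p
        abilities.append(pearson_r(predictions, y))
    return sum(abilities) / len(abilities)
```

Passing the model in as a callable keeps the fold logic identical for the deleterious and random variant sets, so differences in predictive ability reflect the markers rather than the partitioning.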
Data availability
Genotypic data is available in CyVerse (doi: https://doi.org/10.25739/6yts-xq12). Phenotypic data is available at bitbucket (https://bitbucket.org/bucklerlab/sorghum_geneticload/src/master). Supplemental material available at Figshare: https://doi.org/10.25386/genetics.7638122.
Results
Around 33% of nonsynonymous substitutions are putatively deleterious
We resequenced the whole genome of 229 diverse biomass sorghum accessions, belonging to four racial groups that were selected to be representative of diverse geographical regions (Figure S1) (Brown ; Thurber ). The mean sequencing depth was 5.8×, resulting in a data set consisting of ∼5.5 M SNPs. Out of 5.5 M SNPs, ∼6.3% of SNPs are located in coding regions. To determine the distribution of putatively deleterious SNPs in coding regions of the sorghum genome, we first annotated deleterious SNPs using a SIFT score (SIFT < 0.05) that predicts an amino acid substitution effect on protein function (Vaser ). Based on SIFT score < 0.05, we find that ∼33% of the total nonsynonymous substitutions are putatively deleterious (average SIFT score of 0.08), while 67% are predicted as tolerated mutations (average SIFT score of 0.47). We estimated the “derived allele” frequency (DAF) spectrum, with the derived allele defined as a minor allele in the multi-species sequence alignment (Yang ). Our results reveal that a large proportion of deleterious SNPs have a lower DAF (< 0.05; Figure 1a). While DAF shows a negative association with GERP scores (Figure 1b) (Yang ), it has a positively associated pattern with SIFT scores (Figure S5).
Figure 1
Deleterious mutations in the sorghum genome. (a) Site allele-frequency spectrum of nonsynonymous deleterious mutations and synonymous mutations in the sorghum genome. The derived allele frequency (DAF) distribution of alleles is shown where a minor allele in the multi-species alignment was considered as a derived deleterious allele (Yang ). (b) The allele frequency of the derived alleles in bins of different genomic evolutionary rate profiling (GERP) scores. The vertical bars in (b) indicate SE.
We then combined GERP (> 2) and SIFT (< 0.05) scores to identify a higher-confidence set of deleterious SNPs (HGERPDEL-SNPs; Figure S6). Unless otherwise indicated, all further analyses were performed using HGERPDEL-SNPs. While the majority of HGERPDEL-SNPs had an average SIFT score of < 0.01 (Figure S6a), they also showed a low overall allele frequency (average minor allele frequency = 0.07, Figure S6c) that is consistent with population genetic expectations. All identified HGERPDEL-SNPs show similar distributions among all chromosomes (P = 0.34; Figure S6b) and arise from noncentromeric regions of the chromosomes (Figure S7). Our results corroborate previous studies showing that selection acts on deleterious variants to keep them rare (Mezmouk and Ross-Ibarra 2014), and support a combined use of SIFT and GERP scores (Figure 1) as effective quantitative measures of an observed variant for its long-term fitness consequences (Yang ).
Both deleterious and nondeleterious variants exhibit different effect size distributions
We estimated the additive effect sizes explained by HGERPDEL-SNPs for all phenotypic traits. An equal number of nondeleterious variants were used as a control, which are not in LD but have a similar minor allele frequency spectrum of HGERPDEL-SNPs across the genome (Figure S4). We compared the full density distribution of the effect sizes of both HGERPDEL-SNPs and nondeleterious variants to avoid the winner’s curse (Zöllner and Pritchard 2007; Jun ) and examined whether HGERPDEL-SNPs effect sizes are overall higher in magnitude compared to nondeleterious variants (Figure 2).
Figure 2
Smoothed estimate of density distribution of regression coefficients associated with highly conserved Del variants (HGERPDEL-SNPs) and nondeleterious variants for 10 phenotypic traits [(a) biomass, (b) specific leaf area, (c) tissue starch content, and (d–j) plant height 4, 6, 8, 10, 12, 14, and 16 weeks after planting]. Del, deleterious; GERP, genomic evolutionary rate profiling.
Our results show that the density distributions of the effect sizes of both HGERPDEL-SNPs and nondeleterious variants follow a similar pattern, albeit with some subtle differences in the density peak and distribution. The density distribution of HGERPDEL-SNPs extends much farther than the distribution of nondeleterious variants, both at the highest and lowest range of distribution (Figure 2), which is similar to the results of previous studies (Zöllner and Pritchard 2007; Jun ). While such density distributions are consistent across all traits, HGERPDEL-SNPs show different density peaks compared to nondeleterious variants. For some traits, HGERPDEL-SNPs show reduced-density peaks while for height at 4 WAP, HGERPDEL-SNPs show higher-density peaks compared to nondeleterious variants (Figure 2, a–j).
We then compared the empirical cumulative distribution of effect sizes of HGERPDEL-SNPs and nondeleterious variants. Using the two-sample Kolmogorov–Smirnov test, we demonstrate that the effect sizes of HGERPDEL-SNPs and nondeleterious variants show different density patterns for all phenotypes studied (Figure S8). This suggests that HGERPDEL-SNPs have more variable effect sizes compared to nondeleterious variants for all phenotypic traits. Indeed, the observed variance for estimated effects across all traits was twofold higher for HGERPDEL-SNPs, suggesting that HGERPDEL-SNPs have substantially larger and more subtle effects overall.
We also compared the means of folded distributions of both HGERPDEL-SNPs and nondeleterious variants. 
Across all phenotypes, HGERPDEL-SNPs have on average 30.14% (ranging 0–42.34%) higher effects than those observed for nondeleterious variants (Figure 3 and Figure S9). The average effect sizes of HGERPDEL-SNPs therefore appear to be greater than those of nondeleterious variants, which is consistent with previous results observed in maize (Yang ), humans (Marouli ; Jun ), and mice (Ji ).
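The two comparisons used in this section, the two-sample Kolmogorov–Smirnov statistic and the difference in folded means, can be illustrated with a short Python sketch. It rests on assumptions: "folded" is read as taking absolute values of the effect estimates, only the KS statistic (not its P-value) is computed, and the function names are hypothetical. It is not the code used to produce Figure 3 or Figure S8.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def folded_mean(effects):
    """Mean of the folded distribution, i.e. mean absolute effect size."""
    return sum(abs(e) for e in effects) / len(effects)


def percent_higher(deleterious_effects, control_effects):
    """Percent by which the folded mean of deleterious effects exceeds the control."""
    d = folded_mean(deleterious_effects)
    c = folded_mean(control_effects)
    return 100.0 * (d - c) / c
```

Folding before averaging matters because marker effects are centered near zero, so the raw means of the two sets would be close regardless of how widely the effects are spread.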
Figure 3
Bar plots of means of folded distributions of effect sizes of highly conserved deleterious variants (HGERPDEL-SNPs) and nondeleterious variants for 10 phenotypic traits [(a) biomass, (b) specific leaf area, (c) tissue starch content, and (d–j) plant height 4, 6, 8, 10, 12, 14, and 16 weeks after planting]. GERP, genomic evolutionary rate profiling. D, deleterious; ND, nondeleterious.
Deleterious mutation burden varies among racial groups and negatively correlates with phenotypes
We estimated the mutation burden based on HGERPDEL-SNPs as the count of derived deleterious alleles carried by an individual divided by the total number of scored (nonmissing) alleles (see Materials and Methods; Figure 4). This reveals substantial variation in mutation burden among racial groups (P = 3.14 × 10−5) based on the HGERPDEL-SNPs (Figure 4a). The caudatum group has a significantly higher homozygous mutation burden, by an average of 36%, than the other racial groups. Compared to the median burden across all racial groups, the guinea group has a proportionately lower burden (−20%), while the caudatum group has a proportionately higher burden (+49%). On average, an individual carries a homozygous mutation burden of 0.0112 (SD 0.006), 0.0124 (SD 0.006), 0.0140 (SD 0.006), and 0.0178 (SD 0.007) in the guinea, durra, kafir, and caudatum groups, respectively. Across all racial groups, individual mutation burden ranges from 0.001 to 0.038 based on the HGERPDEL-SNPs, suggesting that mutation burden is variable in all racial groups.
Figure 4
Homozygous mutation burden in sorghum. (a) Homozygous mutation burden estimated for different racial groups of sorghum based on highly conserved deleterious variants (HGERPDEL-SNPs). The derived allele is defined as a minor allele from multi-species sequence alignments (Yang ). The mutation burden was estimated as the count of derived deleterious alleles carried by an individual divided by the total number of scored (nonmissing) alleles. The horizontal broken line indicates the mean of homozygous mutation burden across all racial groups. (b) Scatter plots of homozygous mutation burden and PC1 derived from genome-wide SNP markers. The black circles indicate the median values for each group. GERP, genomic evolutionary rate profiling; PC, principal component.
Given that there is a considerable amount of admixture present in sorghum lines, we checked whether admixture influenced the mutation burden estimation among racial groups. We plotted the relationship between the homozygous mutation burden and the principal components derived from genome-wide SNP markers (Figure 4b and Figure S10). This shows that, although there are admixed lines, the caudatum and guinea groups tend toward a higher and a lower mutation burden, respectively (Figure 4, a and b). These results indicate that the deleterious mutation burden estimated from the derived deleterious allele largely reflects the genomic architecture of racial groups and is less biased by admixture.
We further evaluated the underlying relationship of mutation burden with phenotypic traits. Four putative phenotypic fitness traits were selected for this study: DBM, PH (seven developmental stages), SLA, and TSC. We selected these traits because total biomass has been explicitly used as an index of fitness in several species, as it can integrate the overall capacity for survival and reproduction (Donovan ; Younginger ). PH is an ecological fitness trait that incorporates processes for coexistence along spectra of light gradients (Falster and Westoby 2003). SLA is generally regarded as a useful summary ecological trait that often strongly correlates with many key plant attributes of ecological interest (Westoby 1998; Meziane and Shipley 1999). Starch production and its utilization on a diurnal basis, together with its role under diverse growth conditions, is regarded as a major integrator in the regulation of plant growth and hence can be considered a determinant of plant fitness (Sulpice ; Thalmann and Santelia 2017).
We observed substantial phenotypic variation for all traits among racial groups [Figure S11, biomass: P < 0.001; SLA: P < 0.001; starch: P < 0.05; height: P = 5.9e−5 (4WAP), P = 0.04 (6WAP), P = 3.1e−6 (8WAP), P = 3.9e−6 (10WAP), P = 7.5e−5 (12WAP), P = 0.001 (14WAP), and P < 0.05 (16WAP)], with highly heritable variation observed for PH [H2 = 0.87 (at 10 WAP)] and biomass (H2 = 0.73), consistent with previous studies (Brenton ). We also found strong correlations among traits (Figure S12).
Using a simple linear regression model between mutation burden and phenotypic traits across all racial groups, we consistently found a negative relationship of mutation burden with all phenotypic traits (Table S1). We also performed a grouped regression combining racial groups that show parallel responses, and the combined slopes further confirmed the negative correlations between mutation burden and phenotypes (Table S1). These results suggest that deleterious variants decrease trait fitness. However, most of these correlations are not significant; the exception is PH (in the grouped regression only), indicating that, among the traits studied, the deleterious mutation burden is most clearly linked to the variation in PH in the biomass sorghum lines studied.
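The simple regression of a trait on mutation burden can be sketched as below. This is a minimal ordinary least squares fit written for illustration; it returns the slope, intercept, and Pearson correlation but performs no significance test, and the burden and height values are hypothetical rather than data from Table S1.

```python
from math import sqrt


def simple_linear_regression(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical individuals: homozygous burden versus plant height (cm).
burden = [0.010, 0.012, 0.014, 0.018]
height = [300.0, 290.0, 280.0, 260.0]

slope, intercept, r = simple_linear_regression(burden, height)
print(slope < 0, round(r, 3))  # a negative slope means taller plants carry less burden
```

A grouped regression of the kind mentioned above would fit the same model within sets of racial groups and then compare or pool the slopes; that step is not shown here.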
Deleterious variants contribute considerably to phenotypic variation but vary substantially among traits
We tested whether incorporating putatively deleterious variants could inform GS models and improve the GWP of phenotypes. HGERPDEL-SNPs identified from WGS were used as priors and integrated into a genomic prediction framework (Figure 5). We quantified the amount of genetic variance, heritability, and model improvement attributable to deleterious variants, and compared them with those of random variants. Based on a variance partitioning approach with a two-kernel model (see Materials and Methods), the model with putative HGERPDEL-SNPs explained roughly half of the genetic variance [biomass: 52%, SLA: 48%, starch: 46%, and PH: 45–49% (across all stages)] (Figure 5). There was a modest improvement in total heritable variance explained for biomass (7.6%, h2 = 0.24 against 0.22 for random variants) and PH (3.1%, h2 = 0.33 against 0.32 for random variants across seven developmental stages). However, there was no advantage regarding heritable variance for SLA and TSC (Figure 6, a and b) for HGERPDEL-SNPs as compared to random variants.
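The quantities reported above can be related by simple arithmetic, sketched here under stated assumptions: that the two-kernel model yields one variance component for the deleterious-variant kernel, one for the remaining variants, and a residual; and that "improvement" is the relative change in heritability against the random-variant baseline. The function names and the variance components are hypothetical; only the rounded PH heritabilities (0.33 and 0.32) come from the text.

```python
def variance_share(var_del, var_rest):
    """Fraction of genetic variance captured by the deleterious-variant kernel."""
    return var_del / (var_del + var_rest)


def two_kernel_heritability(var_del, var_rest, var_resid):
    """Total heritable variance as a fraction of phenotypic variance."""
    genetic = var_del + var_rest
    return genetic / (genetic + var_resid)


def relative_improvement(h2_model, h2_baseline):
    """Percent change in heritability relative to the baseline model."""
    return 100.0 * (h2_model - h2_baseline) / h2_baseline


# Hypothetical variance components from a two-kernel fit.
print(round(variance_share(0.52, 0.48), 2))                # share of genetic variance
print(round(two_kernel_heritability(0.52, 0.48, 3.0), 2))  # total heritability

# Rounded PH heritabilities quoted in the text.
print(round(relative_improvement(0.33, 0.32), 1))
```

Because the heritabilities in the text are rounded to two decimals, this calculation will not necessarily reproduce every published percentage.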
Figure 5
Heritability estimates for all traits using a two-kernel model. M, model; PH4–16, plant height at 4, 6, 8, 10, 12, 14, and 16 weeks after planting; SLA, specific leaf area; TSC, tissue starch content.
Figure 6
Genome-wide prediction models incorporating putatively deleterious variants. (a and b) Heritability estimates for all traits using a single-kernel model. Heritability estimates for nondeleterious variants are derived based on 100 independent sets that are randomly chosen across the genome from variants that are not in linkage disequilibrium with deleterious variants. (c and d) Boxplots showing a fivefold cross-validation prediction ability estimation for deleterious variants and random variants. Del, deleterious; SLA, specific leaf area; TSC, tissue starch content; WAP, weeks after planting.
We addressed the potential confounding effects of flowering on PH. We estimated heritability based on nonflowered lines (all flowered lines were excluded) within and across racial groups. We observed only minor, nonsignificant differences in heritability, and these model results are complementary to the model results obtained using all genotypes (Figure S13).
To evaluate predictive ability, we performed a fivefold cross-validation, repeated 100 times, implemented in a GBLUP model with either the HGERPDEL-SNPs or the nondeleterious SNP data sets. Consistent with the heritability results, we observed 8.1 and 7.0% improvements in predictive ability for biomass and PH (at 10–16WAP only), respectively, while there was no improvement for SLA, TSC, or PH at early stages (at 4–8WAP; Figure 6, c and d). These results suggest that the contribution of putative HGERPDEL-SNPs to phenotypic variation varies considerably among traits.
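The resampling scheme described above (fivefold cross-validation repeated 100 times) can be sketched as follows. Two points are assumptions: predictive ability is taken to be the Pearson correlation between out-of-fold predictions and observed phenotypes, and `least_squares_predictor` is a hypothetical single-covariate stand-in for the model, not a GBLUP implementation. The toy data are invented.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    saa = sum((v - ma) ** 2 for v in a)
    sbb = sum((v - mb) ** 2 for v in b)
    sab = sum((v - ma) * (w - mb) for v, w in zip(a, b))
    return sab / sqrt(saa * sbb)


def kfold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv(x, y, fit_predict, k=5, repeats=100, seed=0):
    """Return one predictive-ability value per repeat of k-fold CV."""
    rng = random.Random(seed)
    abilities = []
    for _ in range(repeats):
        predicted = [None] * len(y)
        for test in kfold_indices(len(y), k, rng):
            held_out = set(test)
            train = [i for i in range(len(y)) if i not in held_out]
            preds = fit_predict([x[i] for i in train],
                                [y[i] for i in train],
                                [x[i] for i in test])
            for i, p in zip(test, preds):
                predicted[i] = p
        abilities.append(pearson(predicted, y))
    return abilities


def least_squares_predictor(x_train, y_train, x_test):
    """Hypothetical stand-in model: one-covariate least squares (not GBLUP)."""
    n = len(x_train)
    mx, my = sum(x_train) / n, sum(y_train) / n
    sxx = sum((v - mx) ** 2 for v in x_train)
    slope = sum((v - mx) * (w - my) for v, w in zip(x_train, y_train)) / sxx
    return [my + slope * (v - mx) for v in x_test]


# Toy data: 20 lines whose phenotype is an exact linear function of one covariate.
x = [float(i) for i in range(20)]
y = [3.0 - 0.5 * v for v in x]
abilities = repeated_cv(x, y, least_squares_predictor)
print(len(abilities), round(sum(abilities) / len(abilities), 3))
```

To compare marker sets, the same folds would be reused for a model built on the deleterious variants and one built on the nondeleterious variants, so that differences in predictive ability are not driven by the partitioning.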
Discussion
Sorghum, a genus that evolved across diverse environments in Africa, exhibits a wide range of phenotypic diversity (Wright 1931; Doggett 1970; Dillon ). This raises the question of whether sorghum racial groups carry variable deleterious mutation burdens, allowing the consequences of mutations for phenotypic diversity to be tested. In this study, we whole-genome resequenced 229 biomass sorghum lines and defined a high-confidence set of putative deleterious mutations using SIFT (< 0.05) and GERP (> 2) scores. All racial groups of sorghum showed variable mutation burdens (ranging from 0.001 to 0.038) that correlated negatively with phenotypic traits. We observed that an average deleterious variant had larger biological effects than an average nondeleterious variant. We further noticed that the prediction ability of the GWP models encompassing deleterious variants is largely trait-dependent.
Combining the criteria of SIFT (< 0.05) and GERP (> 2) scores, we first show that sorghum racial groups accumulate appreciable amounts of deleterious mutations in the genome, estimated to be ∼33% of total nonsynonymous substitutions (Figure 1). Although the number and frequency of such mutations within a population largely depend on effective population size, our results match well with previous studies that estimate 20–30% of nonsynonymous variants to be deleterious in several crop species, including model plant species (Lu ; Günther and Schmid 2010; Mezmouk and Ross-Ibarra 2014; Ramu ). Considering highly frequent (DAF > 0.9) mutations, there are 63 nonsynonymous deleterious mutations across racial groups, which are distributed across all chromosomes. These mutations are likely a combination of variants of important domestication targets, recent pseudogenes, and some truly deleterious variants that are the product of drift (de Alencar Figueiredo ; Smith ).
We next estimated an individual mutation burden as the count of derived deleterious alleles carried by an individual divided by the total number of scored (nonmissing) alleles, which differed considerably among individuals and racial groups (Figure 4). This is notable but expected, given that different racial groups have had varying patterns of population dynamics, selection intensities, and domestication histories that could detectably alter the influx of deleterious mutations (Wendorf ; Dillon ; Paterson ). Contrasting deleterious burdens have previously been reported in different populations of crop species (Lu ; Renaut and Rieseberg 2015; Ramu ) and humans (Lohmueller ; Fu ; Simons ). Comparatively, the caudatum group appears to have a higher mutation burden than the guinea group, the oldest of the specialized sorghum races (Stemler ; Harlan ). We propose that the higher mutation burden of the caudatum group might be related to a population bottleneck, resulting in a smaller population size that increases the chances of inbreeding, genetic homogeneity, and an increased influx of deleterious mutations (Renaut and Rieseberg 2015; Yang ; Moyers ). On the other hand, the lower mutation burden in the guinea group might be due partly to its higher outcrossing rates relative to other races, which can reach up to 20% (Barro-Kondombo ; Ranwez ). Therefore, our results suggest that, first, negative selection is less effective at removing weakly deleterious mutations, yielding variable mutation burden among racial groups. Second, the combined effects of a bottleneck and directional selection during domestication (Hamblin ; Lohmueller ) can have an important impact on the deleterious mutation burden even in smaller racial groups of sorghum in which founder events can be more frequent (Charlesworth and Wright 2001; Szövényi ).
Although informative, our estimation of mutation burden has some important limitations. First, the deleterious mutations identified in the population were based on the degree of sequence conservation, which is often poorly constructed. Second, our derivation of deleterious mutations does not include noncoding or structural variants, which can contribute substantially to the total load of deleterious mutations (Huang ; Bastarache ). Third, our burden estimation assumes equal fitness effects for all mutations, which is unlikely, as mutations can have different fitness effects that can vary with environments (Henn ). Fourth, we assume the same sign of effect for all mutations when estimating the burden, which could therefore be misestimated, as some deleterious mutations may be locally adaptive or neutral (Vikram ; Bastarache ). Despite these caveats, our findings revealed a substantial genomic burden of deleterious mutations in sorghum.
We investigated the phenotypic effects of deleterious mutations (Table S1). We found negative correlations between mutation burden and phenotypic traits, suggesting a considerable cost of deleterious mutations on phenotypic traits (Yang ) in a species that has been subjected to recent demographic expansion (Hamblin ). Consistently, we find that an average deleterious variant has a demonstrably larger biological effect, which could have an important impact on heritable phenotypic variation (Figure 2 and Figure 3). In grasses, it has been previously shown that heritable phenotypic variation can be increased by as much as 0.1–1% by new mutations (Sprague ; Houle ; Bataillon 2000). However, the fate of such large-effect mutations on phenotypes is unclear, and whether such mutations are attributable to unconditional deleteriousness or can grant adaptable heritable variation to diverse growing conditions has been actively debated (Glémin and Bataillon 2009). Nonetheless, previous studies have revealed novel variations of genes resulting from postdomestication mutations in sorghum and suggest that neodiversity contributed to new adaptations (de Alencar Figueiredo ; Glémin and Bataillon 2009).
Across four traits, we find that putatively deleterious alleles explain roughly one-half of the genetic variance (46–49%), but that there is only a moderate improvement in total heritable variance explained for biomass (7.6%) and PH (3.1%). Additionally, there is no advantage for SLA and TSC (Figure 5 and Figure 6). Such a difference in the contribution of deleterious variants to phenotypic traits was recently observed in maize, where dominance contributed substantially to grain yield, while phenology traits appeared to be largely additive (Yang ). Though whether the effects of mutations are deleterious or compensatory depends greatly upon the genetic background into which a mutation is incorporated (Moyers ), the trivial contributions of mutations to SLA and TSC indicate that such mutations could be either nearly neutral or negatively synergistic. Therefore, our results support the proposition that deleterious mutational effects vary with phenotypic traits and appear to be often larger for fitness-related quantitative traits, while they are unclear for traits that are not directly linked to fitness (Park ). Fitness-related quantitative traits, which are expected to have a more complex genetic architecture, could potentially carry a higher polygenic mutation burden that could considerably affect phenotypes (Purcell ). Also, such expectations are in line with the longstanding understanding that fitness-linked quantitative traits showing directional dominance generally exhibit inbreeding depression (Wright 1931; Kelly 1999; Charlesworth and Charlesworth 1999), which indeed is strongly linked to the degree of deleterious burden in the genome (Mezmouk and Ross-Ibarra 2014).
Finally, although our study did not account for sampling error while estimating an individual deleterious variant effect, which is generally greater for rare variants (Jun ), our heritability estimates are consistent with the prediction abilities of phenotypic traits. Therefore, our work adds to ongoing GWP efforts to explore the cumulative effects of deleterious mutations on phenotypic diversity (Yang ; Moyers ). However, since rare deleterious variants are less correlated with each other and their associations greatly suffer from low statistical power (Park ; Auer and Lettre 2015), employing either gene- and/or family-based approaches (Auer and Lettre 2015; Ji ; Jun ), or leveraging the phenotypic patterns (Bastarache ) in which deleterious mutations have detectable phenotypic consequences, would assist in examining how rare deleterious mutations shape an individual phenotype.
Conclusions
We used phenotypic and genomic data from different racial groups of sorghum to show that sorghum accumulates an appreciable number of deleterious mutations in the genome. Mutation burden differs substantially among racial groups and correlates negatively with phenotypes. GS models encompassing deleterious mutations show variable predictive ability across traits and, given the relatively high level of population structure in sorghum, disentangling deleterious effects at the single-variant level would take a tremendous amount of effort and recombination. Deleterious variants could be prioritized through work with intermediate phenotypes or with more extensive evolutionary analysis among closely related species. Both of these avenues, if combined with high-throughput genome editing and conventional breeding approaches involving parental lines with fewer deleterious variants, could be used to start systematically removing deleterious variants from elite sorghum lines.