
The evolutionary patterns of barley pericentromeric chromosome regions, as shaped by linkage disequilibrium and domestication.

Yun-Yu Chen1,2, Miriam Schreiber1,3, Micha M Bayer1, Ian K Dawson1,4, Peter E Hedley1, Li Lei5, Alina Akhunova5,6, Chaochih Liu5, Kevin P Smith5, Justin C Fay7, Gary J Muehlbauer5, Brian J Steffenson8, Peter L Morrell5, Robbie Waugh1,3, Joanne R Russell1.   

Abstract

The distribution of recombination events along large cereal chromosomes is uneven and is generally restricted to gene-rich telomeric ends. To understand how the lack of recombination affects diversity in the large pericentromeric regions, we analysed deep exome capture data from a final panel of 815 Hordeum vulgare (barley) cultivars, landraces and wild barleys, sampled from across their eco-geographical ranges. We defined and compared variant data across the pericentromeric and non-pericentromeric regions, observing a clear partitioning of diversity both within and between chromosomes and germplasm groups. Dramatically reduced diversity was found in the pericentromeres of both cultivars and landraces when compared with wild barley. We observed a mixture of completely and partially differentiated single-nucleotide polymorphisms (SNPs) between domesticated and wild gene pools, suggesting that domesticated gene pools were derived from multiple wild ancestors. Patterns of genome-wide linkage disequilibrium, haplotype block size and number, and variant frequency within blocks showed clear contrasts among individual chromosomes and between cultivars and wild barleys. Although most cultivar chromosomes shared a single major pericentromeric haplotype, chromosome 7H clearly differentiated the two-row and six-row types associated with different geographical origins. Within the pericentromeric regions we identified 22 387 non-synonymous SNPs, 92 of which were fixed for alternative alleles in cultivar versus wild accessions. Surprisingly, only 29 SNPs found exclusively in the cultivars were predicted to be 'highly deleterious'. Overall, our data reveal an unconventional pericentromeric genetic landscape among distinct barley gene pools, with different evolutionary processes driving domestication and diversification.
© 2022 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.

Keywords: Hordeum vulgare; SNPs; diversity; domestication; evolution; pericentromeric regions

Year:  2022        PMID: 35834607      PMCID: PMC9546296          DOI: 10.1111/tpj.15908

Source DB:  PubMed          Journal:  Plant J        ISSN: 0960-7412            Impact factor:   7.091


INTRODUCTION

Continued improvements in crop productivity are critically founded upon the ability of breeders to identify new genotypes that outperform existing varieties when measured against an evolving set of agricultural challenges (Thomas, 2003). Recombination during meiosis is the process that has traditionally driven this, providing a mechanism by which existing parental alleles are shuffled in progeny into new and better combinations that are selected through phenotypic and genotypic screening. Meiotic recombination is typically unevenly distributed across chromosomes, being frequent in telomeric regions and suppressed in pericentromeric areas, which are characterized by high levels of linkage disequilibrium (LD) (Choulet et al., 2014; Gore et al., 2009; Higgins et al., 2014; Wu et al., 2003). In an extreme cereal crop example, all crossovers were observed to occur within the distal 13% of the physical length of chromosome 3B of Triticum aestivum (bread wheat) (Choulet et al., 2014). For plant breeding efforts, extended chromosomal regions with minimal recombination reduce the efficacy of selection (Hill & Robertson, 1966), making it more difficult to remove deleterious mutations (Felsenstein, 1974), inhibiting the shuffling of alleles into favourable combinations (Baker et al., 2014) and reducing genetic diversity as a result of background selection (Charlesworth et al., 1993). Given the practical constraints that high levels of LD in pericentromeric areas can impose on crop improvement, much research effort has focused on molecularly dissecting the recombination machinery and using the resulting information to try to develop strategies to modify where and how frequently recombination occurs. In contrast, the evolutionary impacts of the lack of recombination have received only limited research attention, and interactions with other genetic processes, such as domestication, crop diversification and adaptation, remain largely unaddressed.
Here, to explore how a lack of regional recombination affects cereal crop genome evolution, we have performed an exhaustive genetic analysis of pericentromeric and non‐pericentromeric regions in the primarily self‐fertilizing crop plant Hordeum vulgare ssp. vulgare (barley), and its wild progenitor, Hordeum vulgare ssp. spontaneum. We chose barley as our model because extensive sequence analysis of formally bred homozygous genotypes (i.e. genotypes that are the end product of selection from directed bi‐ or multi‐parental crosses, hereafter referred to as ‘cultivars’) sampled from across the globe has identified vast tracts of the genome with limited genetic diversity (Mascher et al., 2017; Beier et al., 2017; Bustos‐Korts et al., 2019; Kono et al., 2019). In addition, parallel sequence analysis of extensive collections of wild barley sampled from its natural habitat in the Fertile Crescent and of landraces from across the eco‐geographical expansion range of the crop has been undertaken (Feuillet et al., 2008; Morrell et al., 2014). The assembled knowledge of patterns of genotypic diversity, alongside evidence collected on the founding lineages of the barley crop that suggest a complex history with gene flow and introgression during the expansion of cultivation, provides an informed starting point for our analysis (Morrell & Clegg, 2007; Pankin et al., 2018; Poets et al., 2015; Russell et al., 2016; Saisho & Purugganan, 2007). Estimates indicate that the low‐recombining pericentromeric portion of barley chromosomes is among the largest of the cereal crops, covering around 48% of the physical genome (International Barley Genome Sequencing Consortium et al., 2012; Baker et al., 2014; Beier et al., 2017). During the evolution of the barley crop these pericentromeric regions will have, to a large extent, remained ‘locked’, with limited genetic exchange. 
We argue that these recombinationally inert expanses provide opportunities to explore the early domestication and diversification history of the crop. Of relevance to our analyses, previous studies of mutational load have not identified a greater proportion of deleterious variants in the pericentromeric regions of the barley chromosome, in contrast to other selfing crops such as Oryza sativa (rice) and Glycine max (soybean). The pericentromeric chromosomal regions of barley may therefore harbour unique features that are particularly worthy of exploration (Kono et al., 2016, 2019; Liu et al., 2017). As defined in the reference genome assembled previously by Mascher et al. (2017), each of the seven chromosomes has been spatially organized into distal (zone 1), interstitial (zone 2) and proximal (zone 3) compartments, based upon the frequencies of repetitive DNA (20 mers) and gene structure. Here, by analysing genome‐wide zonally partitioned variant data derived from exome sequences of a comprehensive panel of cultivar, landrace and wild barleys, we were able to trace the varied evolutionary histories of the pericentromeric regions for all seven barley chromosomes. We found that genetic bottlenecks and limited recombination underlie the unconventional pericentromeric genetic landscape observed in the barley gene pool, with different evolutionary processes in individual chromosomes and sub‐chromosomal zones providing new evidence concerning founder events during domestication and diversification. By characterizing these genome‐scale evolutionary patterns, our data provide an opportunity to comprehensively assess the extent to which the lack of recombination has been (and continues to be) a constraint on barley breeding, while lending further support to the potential value of exploiting barley genetic resources for future crop improvement.

RESULTS AND DISCUSSION

We assembled and analysed a collection of new and existing whole‐exome capture sequence data from an initial panel of 879 accessions of cultivar, landrace and wild barleys sampled from across their eco‐geographical ranges, identifying 93 849 112 variants (Figure S1; Table S1) (Bustos‐Korts et al., 2019; Hübner et al., 2009; Russell et al., 2016; Steffenson et al., 2007). Following variant filtering and the removal of wrongly assigned accessions (Figures S2 and S3), a final data set was generated that comprised 3 082 873 high‐quality single‐nucleotide polymorphisms (SNPs), most of which had a minor allele frequency (MAF) of <0.05 (n = 2 742 309), from a stringently curated and comprehensive set of 815 accessions (163 cultivars, 388 landraces and 264 wild barleys) (Table S2). For an initial check of the overall genetic relationship between these accessions, we conducted principal coordinate analysis (PCO) and inferred admixture using a randomly chosen genome‐wide set of SNPs (Figure 1). A clear division between wild and domesticated barleys was observed (Figure 1a), as had been expected from our prior work on smaller barley panels (e.g. Russell et al., 2016), with seven ‘subpopulations’ identified (Figure 1b) with designations corresponding to the groupings observed in the PCO. As expected from this earlier work, cultivar germplasm appeared to be derived from subsets of landraces, and a split was observed between two‐rowed and six‐rowed accessions.
Figure 1

Population structure of 815 barley accessions. (a) Principal coordinate analysis (PCO) based on 9845 randomly selected single‐nucleotide polymorphisms (SNPs). Samples are colour coded based on domestication status and row type. The proportion of variance explained by each PCO is labelled beside the axes. The figure was produced with curlywhirly (https://ics.hutton.ac.uk/curlywhirly/). (b) Genetic admixture proportion inferred from faststructure based on the same 9845 SNPs for the PCO analysis. Colour blocks represent different estimated ancestral populations (K = 7). Samples were grouped based on domestication status and row type, as indicated by the black bars below. The figure was produced using structure plot (Ramasamy et al., 2014).

We then explored different portions of the seven chromosomes of the barley genome. For this purpose, we partitioned each chromosome into three discrete zones using the physical positions reported by Mascher et al. (2017) (Table S4) that were reminiscent of the three compartments applied in an earlier analysis of bread wheat chromosome 3B (Choulet et al., 2014). Zone 1 covers the distal portions of each chromosome, characterized by high gene content and frequent recombination, zone 2 covers the interstitial regions with intermediate gene content and zone 3 approximates the pericentromeric regions, enriched in housekeeping genes with little or no recombination (Keller & Krattinger, 2017). We then generated a range of individual SNP‐ and chromosome‐based diversity‐related analyses for our barley germplasm groups (Figure 2). A clear genomic partitioning pattern between the zones (as defined in Figure 2) was observed, with the pericentromeric regions generally showing reduced genetic diversity (Figure 2a).
In particular, the pericentromeric regions of domesticated accessions (cultivars and landraces) in our collection showed dramatically reduced diversity on chromosomes 1H, 2H and 4H, where the genetic diversity (π) values ‘flat‐lined’; by contrast, more distal regions showed both higher diversity and ‘noisier’ profiles. Examining profiles of per‐SNP differentiation (F ST) between pairs of barley groups (Figure 2b–d), we observed distinctive patterns, sometimes including fixed differences, for pericentromeric regions. Intriguingly, F ST values within zone 3 aligned into multiple horizontal ‘tracks’ that comprised long stretches of SNPs with shared F ST values that sometimes extended in both directions into zone 2. The longest track, of approximately 200 Mbp, was located on chromosome 4H. Moreover, multiple ‘break points’ within tracks (creating multiple tracks with different F ST values) were also observed. Zone‐3 tracks with high F ST values (0.8–1.0) were most noticeable in the cultivar–wild barley comparison (Figure 2b) for chromosomes 1H, 2H, 4H, 5H and 6H, indicating the close to complete, and sometimes complete, fixation of different allelic states between the two gene pools. Some of these large values may be associated with structural variants, as observed in previous studies in Zea mays (maize) and barley (Fang et al., 2012; Fang et al., 2014; Lei et al., 2019), but this was not explicitly tested here. Consistent with their similar π profiles, tracks of high F ST appeared absent from the cultivar–wild barley comparison of zone‐3 areas for chromosomes 3H and 7H. Extending this comparison, in the landrace–wild barley F ST graph (Figure 2c) the horizontal track patterns within zone 3 were maintained, but generally with lower F ST values and with no regions with complete differentiation (F ST = 1). For the cultivar–landrace comparison (Figure 2d), features of the same pattern were retained, but less obviously and with even lower F ST values.
Figure 2

Extensive genetic differentiation in the pericentromeric regions among Hordeum vulgare (barley) groups, showing all single‐nucleotide polymorphisms (SNPs) without minor allele frequency (MAF) filtering. The top track shows the chromosome diagrams, with the gradient of blue colours representing zone 1 (light blue), zone 2 (medium blue) and zone 3 (dark blue) regions, and the red bars representing the centromere, using the coordinates reported by Mascher et al. (2017) and physical distance. (a) Genetic diversity (π): red, wild barleys; orange, landraces; blue, cultivars. (b) Fixation index (F ST) between cultivars and wild barleys. (c) F ST between landraces and wild barleys. (d) F ST between cultivars and landraces. In (b) and (c), sites with F ST ≥ 0.8 were coloured red (with no such sites in panel d).

In the case of the cultivar–wild barley comparison, the different F ST tracks are illustrated schematically in Figure 3(a–d). The simple case of fixed alternate SNP states in cultivars and wild barleys is shown in Figure 3(a), which could represent an example where an early post‐domestication allele is driven to fixation over the last 10 000 years of cultivation and expansion. Figure 3(b) represents a common run of shared states between the two barley categories (where the shared state in wild barley may indicate its progenitor status). In most of the pericentromeric regions, however, there is a mixture of completely and partly differentiated SNPs, presumably through the presence of multiple ancestral wild haplotypes, resulting in the ‘overlapping’ horizontal tracks of F ST of Figure 3(c). Figure 3(d) shows the situation where a rare recombination event happens between wild barleys, causing a shift of allele frequencies at a chromosomal scale and forming the break points observed, as highlighted for the actual case of barley chromosome 4H in Figure 3(e).
Figure 3

Diagram of how different wild founder haplotypes give rise to horizontal F ST patterns. (a) In the simplest case, single‐nucleotide polymorphisms (SNPs) in cultivars and wild barleys are fixed completely at two different states and a track of F ST = 1 is formed. (b) Horizontal track with a lower F ST value is formed when some wild barleys share the fixed cultivated allele. (c) ‘Overlapping’ horizontal tracks of F ST formed when different wild barley alleles have varying degrees of differentiation from the cultivars. (d) ‘Break point’ variable horizontal tracks of F ST formed that represent rare recombination between two wild barley founder haplotypes. (e) Real exome sequence genotype data from a segment of barley chromosome 4H, zone 3, showing at least three wild barley founder haplotypes, separated by white space, in this region: the ancestors of the cultivars and one possible double crossover event between different wild founders (asterisk).
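The track values sketched in Figure 3 follow directly from the allele frequencies of the two groups. As a minimal illustration (not necessarily the estimator used in this study, which is not named in this section), the Python sketch below applies Wright's F ST = (H_T - H_S) / H_T to a single biallelic SNP in two equally weighted groups. The function name and the toy frequencies are hypothetical and chosen only to mirror panels (a) and (b).

```python
def fst_two_groups(p1, p2):
    """Wright-style F_ST for one biallelic SNP and two equally weighted groups.

    p1, p2: frequency of the same allele in group 1 and group 2 (0.0 to 1.0).
    H_S is the mean within-group expected heterozygosity and H_T is the
    expected heterozygosity at the mean allele frequency.
    Returns 0.0 for a site that is monomorphic across both groups.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_t - h_s) / h_t


# Panel (a): cultivars and wild barleys fixed for alternative alleles.
fixed_difference = fst_two_groups(1.0, 0.0)

# Panel (b): cultivars fixed, but a quarter of wild barleys carry the cultivar allele.
shared_allele = fst_two_groups(1.0, 0.25)

# SNPs in a non-recombining block that share the same group frequencies
# all receive the same value, which is what draws a horizontal 'track'.
track = [fst_two_groups(1.0, 0.25) for _ in range(5)]
```

Under these assumptions the fixed-difference case gives a value of 1, while the shared-allele case gives a lower value (about 0.6 here) that is identical for every SNP carrying the same frequencies.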

We next analysed genome‐wide linkage disequilibrium of cultivar, landrace and wild barley groups. Initial examination of genome‐wide average r² estimates showed that LD decay in the cultivars was around 1.5× slower overall than in the wild barleys, and about 1.2× slower than in the landraces (Figure S4). Further examination of LD revealed contrasting haplotype block structures between the different germplasm categories (Table 1). The largest block per chromosome averaged 158 637 kbp in cultivars, compared with only 26 284 kbp in wild barleys. Blocks covered over 90% of each chromosome in cultivars but only about 50% in the wild barley group; nevertheless, the wild barley blocks still contained many more SNP variants (about 1.6 times as many, with an average of 46 597 compared with 28 453). Levels of LD and block structure also varied between chromosomes, with 3H and 7H, for example, having markedly smaller largest blocks in cultivars (80 843 and 89 407 kbp, respectively) than the average.
For all germplasm categories, chromosome 4H had comparatively few blocks and the greatest chromosome block coverage (94% in cultivars, 93% in landraces and 61% in wild barleys).
Table 1

Linkage disequilibrium (LD) haplotype block structure for each group

Group                   Chr.     Chr. length (bp)  No. blocks  Block coverage (kb)  Chr. block coverage (%)  Largest block (kb)  No. SNPs in blocks
Cultivars (n = 163)     1H       558 535 432       932         505 269              90                       161 870             22 405
                        2H       768 075 024       1418        707 970              92                       184 043             33 234
                        3H       699 711 114       1161        635 706              91                       80 843              31 218
                        4H       647 060 158       691         610 364              94                       258 652             20 001
                        5H       670 030 160       1351        615 478              92                       186 594             36 651
                        6H       583 380 513       1041        542 069              93                       149 053             26 062
                        7H       657 224 000       1221        597 940              91                       89 407              29 601
                        Average                    1116        602 114              92                       158 637             28 453
Landraces (n = 388)     1H       558 535 432       1843        485 605              87                       74 708              31 909
                        2H       768 075 024       2746        667 275              87                       125 158             49 418
                        3H       699 711 114       2613        611 705              87                       134 606             49 199
                        4H       647 060 158       1476        602 320              93                       185 970             34 126
                        5H       670 030 160       2457        591 257              88                       185 621             50 045
                        6H       583 380 513       2170        508 486              87                       130 284             40 095
                        7H       657 224 000       2501        572 238              87                       76 166              46 584
                        Average                    2258        576 984              88                       130 359             43 054
Wild barleys (n = 264)  1H       558 535 432       5769        275 005              49                       4476                41 438
                        2H       768 075 024       6835        373 400              49                       6847                52 893
                        3H       699 711 114       6686        364 791              52                       81 241              49 423
                        4H       647 060 158       5153        392 599              61                       10 417              45 920
                        5H       670 030 160       6684        326 365              49                       55 927              49 417
                        6H       583 380 513       4588        316 772              54                       17 857              35 907
                        7H       657 224 000       6932        306 958              47                       7225                51 179
                        Average                    6092        336 556              51                       26 284              46 597
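As a reading aid for Table 1, the 'Chr. block coverage (%)' column can be recovered from the chromosome length and block coverage columns. The short Python sketch below shows that arithmetic, assuming 1 kb = 1000 bp and rounding to the nearest whole per cent. The function name is hypothetical, and the input values are copied from the 1H and 4H rows of the table.

```python
def chr_block_coverage_pct(block_coverage_kb, chr_length_bp):
    """Percentage of a chromosome covered by haplotype blocks.

    block_coverage_kb: summed block length in kb (assumed 1 kb = 1000 bp).
    chr_length_bp: physical chromosome length in bp.
    """
    return 100 * (block_coverage_kb * 1000) / chr_length_bp


# Values taken from Table 1.
cultivar_1h = round(chr_block_coverage_pct(505_269, 558_535_432))
cultivar_4h = round(chr_block_coverage_pct(610_364, 647_060_158))
wild_1h = round(chr_block_coverage_pct(275_005, 558_535_432))
```

With these inputs the rounded values agree with the tabulated 90%, 94% and 49%, respectively.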
We then extended our analysis to explore genes and gene haplotype features by chromosome and chromosome zone (Figure 4; Table S3). The greatest number of haplotypes per gene, accounting for different group sample size, was identified for wild barley (Figure 4a), with the median value of approximately 50 being about five times that of the cultivar group, which had the fewest haplotypes per gene. When we compared haplotype richness (randomly selecting 100 accessions for each of the three groups, then calculating the number of haplotypes for these, and repeating this analysis 100 times to generate averages) (Figure 4b), we found that zone 3 always had the lowest values and zone 1 had the highest values, consistent with earlier diversity profiles (Figure 2). Comparing wild and cultivar categories, zone 3 in wild barley had a much higher richness than zone 1 in the cultivars (about double). The frequencies of the major haplotype were higher for cultivars (approx. 60% median value for the major haplotype as a proportion of all haplotypes at each gene) than for landraces and wild barleys (50 and 25%, respectively) (Figure 4c). Corresponding with haplotype richness estimates by chromosome zone (Figure 4b), the dominance of a single haplotype was most prominent in zone 3 of each barley group (Figure 4d). In the cultivars the median frequency value for the major haplotype was over 80% in the zone‐3 area. Data on block sizes (Figure 4e) were consistent with the patterns recorded in Table 1. The difference in block sizes between chromosome zones was much larger for cultivars than for wild barley, with landraces having intermediate differences (Figure 4f).
To put these data into a practical context relevant for breeding, the block size observed in the most variable chromosomal region of the cultivars (zone 1) did not differ significantly from that of the least diverse chromosomal region of wild barleys (zone 3) (Table S4).
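The haplotype richness comparison described above (100 accessions drawn per group, haplotypes counted, repeated 100 times and averaged) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: sampling is assumed to be without replacement, each accession is represented by one haplotype label for a given gene, and the function name, seed and toy labels are hypothetical.

```python
import random


def haplotype_richness(haplotypes, sample_size=100, n_reps=100, seed=0):
    """Mean number of distinct haplotypes across equal-sized random subsamples.

    haplotypes: list with one haplotype label per accession for a single gene.
    sample_size, n_reps: 100 and 100 in the text; drawing without replacement
    is an assumption, as the sampling scheme is not stated in this section.
    """
    if sample_size > len(haplotypes):
        raise ValueError("sample_size exceeds the number of accessions")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = []
    for _ in range(n_reps):
        subsample = rng.sample(haplotypes, sample_size)
        counts.append(len(set(subsample)))
    return sum(counts) / n_reps


# Toy groups sized like the cultivar and wild barley panels.
one_dominant = ["hapA"] * 163                    # a single shared haplotype
all_distinct = [f"hap{i}" for i in range(264)]   # every accession differs

richness_low = haplotype_richness(one_dominant)
richness_high = haplotype_richness(all_distinct)
```

Fixing the subsample size is what makes groups of different size (163, 388 and 264 accessions) comparable, because the raw haplotype count otherwise grows with the number of accessions sampled.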
Figure 4

Gene haplotype analysis for different barley chromosome zones. Haplotypes of 32 222 genes with variants covered by exome sequencing were characterized. (a) Gene haplotype count by chromosome. (b) Gene haplotype count by chromosome zone. (c) Major haplotype frequency by chromosome. (d) Major haplotype frequency by chromosome zone. (e) Block size (bp) by chromosome. (f) Block size (bp) by chromosome zone. Key: blue, cultivars; orange, landraces; red, wild barleys.

These pericentromeric haplotype analyses provided indications of how evolutionary histories have varied among barley chromosomes. To evaluate further the factors involved, for each chromosome we studied the selection signals, structure and gene content of zone 3, compared with other zones. First, we used the μ statistic, which is a composite measure based on site variation, site frequency spectrum and LD profile (Alachiotis & Pavlidis, 2018), to identify potential signals of selective sweeps (Figure 5). For each barley group, we highlighted variants where μ scores were above our 95th percentile threshold, taken to suggest the presence of a selective sweep (Figure 5a). The calculated μ thresholds were 4.56 × 10⁻⁵, 1.93 × 10⁻⁵ and 1.26 × 10⁻⁶ for cultivar, landrace and wild barleys, respectively. Analysis revealed the strongest evidence of selective sweeps in domesticated barleys on chromosome 4H (Figure 5a), although there was no significant difference in average μ scores between chromosomes for any barley group (Figure 5b). For each of the germplasm groups, zone‐3 regions cumulatively showed the highest μ scores and zone‐1 regions the lowest (Figure 5c), suggesting that, overall, pericentromeric regions are subjected to greater positive selection. An unusual feature, however, was the high μ scores found for a non‐pericentromeric region of chromosome 6H in wild barleys (Figure 5a,b).
Based on μ values in cultivars, even for zone 1 (lowest average score among zones), the evidence for selective sweeps is many orders of magnitude greater than for zone 3 in wild barleys (highest average score among zones).
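The per-group 95th percentile cut-off used to highlight candidate sweeps can be illustrated with a small Python sketch. It assumes a nearest-rank percentile definition (the exact percentile method is not stated in this section) and takes a plain list of μ scores as input. The function names are hypothetical, and the scores are placeholders rather than values from the study.

```python
import math


def percentile_threshold(scores, pct=95.0):
    """Nearest-rank percentile: smallest score with at least pct% of values <= it."""
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def flag_sweep_candidates(scores, pct=95.0):
    """Indices of positions whose score lies strictly above the percentile threshold."""
    threshold = percentile_threshold(scores, pct)
    return [i for i, score in enumerate(scores) if score > threshold]


# Placeholder scores for one group; thresholds are computed per group in the text.
toy_scores = [float(i) for i in range(1, 101)]
toy_threshold = percentile_threshold(toy_scores)
toy_candidates = flag_sweep_candidates(toy_scores)
```

Because the threshold is derived separately within each group, a flagged region is extreme relative to that group's own distribution of scores, so thresholds of different magnitude (as reported for cultivars, landraces and wild barleys) are expected.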
Figure 5

Signatures of positive selection in barley differentiated by chromosome and zone. (a) Selective sweep signal (μ) of barley genomes. Red colours represent genomic regions with μ values above the 95th percentile. The top track shows the chromosome diagrams, with the gradient of blue colours representing zone 1 (light blue), zone 2 (medium blue) and zone 3 (dark blue) regions, and the red bars representing the centromere, using the coordinates reported by Mascher et al. (2017). (b) Distribution of μ values by chromosome for different barley groups. (c) μ values by zone (data from all seven chromosomes combined) for different barley groups.

We next assessed the structure of pericentromeric regions by exploring intraspecific relationships among samples for zone‐3 SNPs in each barley chromosome and comparing the results with zone‐1 and ‐2 SNPs combined. The zone‐3‐specific profiles showed the clustering of cultivars and landraces into one to three ‘monophyletic’ clades, separated by clusters of wild barley accessions, and contrasting pictures between chromosome zones and chromosomes (Figure 6a,b, examples of chromosomes 4H and 7H; for the remainder of the chromosomes, see Figure S5). Polytomy, often observed only for cultivar and landrace zone‐3 SNPs, indicated an inability to distinguish these accessions, whereas zone‐3 SNPs on chromosome 7H split domesticated barley into two major clusters associated with different sets of wild barleys (Figure 6b) in a pattern not observed for 4H (other chromosomes except 3H showed a similar pattern to 4H, Figure S5).
Figure 6

Maximum‐likelihood (ML) trees for barley constructed using single‐nucleotide polymorphisms (SNPs) from zones 1 and 2, compared with ML tree constructed using zone‐3 SNPs. (a) Chromosome 4H. (b) Chromosome 7H.

To capture the variation characteristics of zone‐3 ‘phylogenies’ visually, we assigned individuals to simplified ‘haplotype groups’ (haplogroups), which allowed the identification of subgroups of related haplotypes, where the genetic distance between accessions within groups was set at a maximum value of 0.045 according to the methods of Balaban et al. (2019). On this basis, we identified between nine and 21 haplogroups for the zone‐3 region of each chromosome (Figures [Link], [Link]; Table S5). By tracing the haplogroup identity of each accession, parallel plots revealed differences in the sample‐wide diversity profiles of zone 3 between chromosomes for the different groups (Figure 7, each run of connected lines represents a summary of haplotype positions for a barley accession). These profiles show that the vast majority of cultivars share a single zone‐3 haplogroup for each chromosome, except for 7H, with two major groups, one that represented primarily two‐rowed types and the other that represented primarily six‐rowed types (Figure 7b). This split for 7H was mirrored for two‐rowed and six‐rowed landraces (Figure 7c; evident also in Figure 6b). Of the 113 zone‐3 haplogroups identified across all chromosomes and barley categories, 110 were present in wild barleys, with only 34 and 23 present in landraces and cultivars, respectively (Figures [Link], [Link]). Several relatively common haplogroups in wild barley (e.g. 2H, 5H, 6H; Figures [Link], [Link] and S11, respectively) appeared to show a gradient of frequency occurrence across barley categories where landraces had intermediate frequencies higher than cultivars, possibly representing trails of founder events in the development of the modern crop.
Summarized counts of haplogroups for cultivars and landraces showed the predominance of single haplogroups for most barley chromosome zone‐3 regions, with this predominance being less pronounced for landraces than for cultivars (Figures [Link], [Link]). Comparing these predominant domesticated zone‐3 haplogroups with wild barley, only in two chromosomes (1H and 4H) were the same haplogroups the most common, whereas for other chromosomes the predominant domesticated haplogroup occurred in less than 10% of wild barleys. In the case of chromosome 7H that showed row‐type‐related zone‐3 haplogroups for domesticated barleys (Figure 7b,c), the two‐row‐ and six‐row‐related haplogroups occurred in 20 and 13% of wild accessions (all wild types are two‐row type), respectively (Figure S12). To explore this further, we plotted the geographical position of the common cultivar haplogroups that were present in wild barley, based on known collection coordinates (Figures [Link], [Link]), observing considerable variation in distribution, depending on chromosome. For both chromosomes 1H and 4H, where all barley categories shared the same most common zone‐3 haplogroup, these were observed across the geographic range of wild barley (Figures S6 and S9). Where the dominant domesticated haplogroup for a zone‐3 region only occurred at low frequency in wild barley, however, geographic distributions – representing the putative ancestral origins of the crop – varied in wild barley by chromosome (Figures [Link], [Link], [Link], [Link] and S12). On chromosome 2H, for example, the most common domesticated haplogroup was present in only six wild barleys restricted to Israel and Jordan (Figure S7), whereas on chromosome 5H the most common domesticated haplogroup was again present in only six wild barleys but, in this case, these were distributed across the Fertile Crescent (Figure S10). 
The row‐related zone‐3 haplogroups observed in domesticated barley for chromosome 7H showed an interesting geographic distribution in wild barley, with the two‐row‐associated haplogroup restricted to the Fertile Crescent and the six‐row‐associated haplogroup distributed throughout the range (Figure S12).
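The haplogroup assignment described above caps the genetic distance between accessions within a group at 0.045. As a rough illustration only, the sketch below applies complete‐linkage agglomerative clustering with that cap to a toy pairwise distance table. This is a simplified stand‐in and not the tree‐based procedure of Balaban et al. (2019); the accession names and distances are invented for the example.

```python
from itertools import combinations

MAX_DIST = 0.045  # maximum within-group distance quoted in the text


def complete_linkage_groups(names, dist, max_dist=MAX_DIST):
    """Greedily merge clusters while the largest pairwise distance inside
    the merged cluster stays <= max_dist (complete linkage with a cap)."""
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        # Complete linkage: the largest distance between any two members.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while True:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            link = linkage(clusters[i], clusters[j])
            if link <= max_dist and (best is None or link < best[0]):
                best = (link, i, j)
        if best is None:          # no pair can be merged under the cap
            return clusters
        _, i, j = best            # i < j, so deleting j keeps i valid
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]


# Hypothetical accessions and distances (not taken from the study).
names = ["cv1", "cv2", "lr1", "wild1"]
dist = {
    frozenset(("cv1", "cv2")): 0.010,
    frozenset(("cv1", "lr1")): 0.030,
    frozenset(("cv2", "lr1")): 0.040,
    frozenset(("cv1", "wild1")): 0.200,
    frozenset(("cv2", "wild1")): 0.210,
    frozenset(("lr1", "wild1")): 0.190,
}
groups = complete_linkage_groups(names, dist)
```

With these toy values the two cultivars and the landrace fall into one group and the wild accession stays on its own, which mirrors the qualitative pattern described for most chromosomes.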
Figure 7

Pericentromeric genetic diversity in Hordeum vulgare (barley) visualized as haplogroups. Horizontal lines connecting through each chromosome represent barley accessions (colour coded by domestication status and row type). The vertical position of the line at any given chromosome represents the haplogroup number identified for that accession, based on the order presented in Table S5. The four panels show the diversity profile of: (a) all 815 accessions; (b) cultivars; (c) landraces; and (d) wild barleys.

Domestication bottlenecks and the effects of selection predict reductions in genetic diversity and the accumulation of deleterious alleles in a finite domesticated gene pool (Comeron et al., 2008; Lu et al., 2006; Makino et al., 2018). We were interested to explore whether potential deleterious alleles had, as a result of evident bottlenecks and a lack of recombination, become fixed in the barley crop gene pool. Based on SnpEff annotation (Cingolani et al., 2012), we located 22 387 non‐synonymous SNPs within the zone‐3 region across all tested barley accessions. Zone 3 of chromosome 4H had the highest count of non‐synonymous SNPs, likely linked with being the physically largest such zone as well as the least diverse chromosome in domesticated barley (Table S6). The non‐synonymous zone‐3 SNPs were then filtered based on FST values of >0.8 in both cultivar–wild barley and landrace–wild barley comparisons (see Figure 2b,c). After filtering, 92 SNPs remained and most were located on chromosomes 2H and 4H, with none in the zone‐3 regions of chromosomes 3H and 7H, probably because chromosomes 3H and 7H have major splits in the pericentromeric haplogroups. The provean (Choi et al., 2012) scores of the 92 SNPs indicated that 29 cultivar alleles had values that were lower than the predefined threshold of −2.5, suggesting a deleterious effect (Table 2). Twenty‐eight of the 29 were missense variants, with a single stop‐loss variant on chromosome 6H. 
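The two‐step filter described above (differentiation in both domesticated‐versus‐wild comparisons, then a PROVEAN score below −2.5) can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, field names and SNP values are hypothetical, and only the two thresholds come from the text.

```python
FST_MIN = 0.8           # threshold applied to both comparisons in the text
PROVEAN_CUTOFF = -2.5   # scores below this are treated as deleterious


def candidate_deleterious(snps):
    """Return IDs of SNPs that are strongly differentiated in both
    comparisons and have a PROVEAN score below the cutoff."""
    kept = []
    for snp in snps:
        differentiated = (
            snp["fst_cultivar_wild"] > FST_MIN
            and snp["fst_landrace_wild"] > FST_MIN
        )
        if differentiated and snp["provean"] < PROVEAN_CUTOFF:
            kept.append(snp["id"])
    return kept


# Invented records for illustration only (not values from Table 2).
toy_snps = [
    {"id": "snpA", "fst_cultivar_wild": 0.95, "fst_landrace_wild": 0.90, "provean": -3.8},
    {"id": "snpB", "fst_cultivar_wild": 0.95, "fst_landrace_wild": 0.40, "provean": -6.0},
    {"id": "snpC", "fst_cultivar_wild": 0.85, "fst_landrace_wild": 0.88, "provean": -1.2},
]
selected = candidate_deleterious(toy_snps)
```

Only the first toy record passes: the second is differentiated in just one comparison, and the third is differentiated but not predicted to be deleterious.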
At least three genes that harboured ‘fixed’ deleterious alleles were of potential agricultural interest and are highlighted in Table 2. On chromosome 1H the affected gene was a galactosyltransferase, which could be related to the biosynthesis of arabinoxylan, a cell wall component and a main contributor of dietary fibre (Hassan et al., 2017); on chromosome 2H, the gene annotated as the E3 ubiquitin protein ligase NEURL1B is a candidate associated with grain weight in maize (Zhao & Su, 2019); and on chromosome 6H, an Xaa‐Pro peptidase could relate to the mobilization of barley storage proteins during germination (Davy et al., 2000). The functional implication of these predicted deleterious alleles will require further verification.
Table 2

Potential deleterious alleles fixed in domesticated gene pools

| Chr. | Position | Effect | Wild seq. | Cultivar seq. | Gene affected | Transcript affected | PROVEAN score | Annotation | Morex v.3 gene ID |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1H | 161 039 495 | Missense | Asn | Tyr | BART1_0‐u02060 | 1 | −3.819 | Galactosyltransferase | HORVU.MOREX.r3.1HG0031180 |
| 1H | 253 486 741 | Missense | Ser | Phe | BART1_0‐u02519 | 1, 2, 3 | −5.483 to −5.800 | ABC transporter G family member 24 | HORVU.MOREX.r3.1HG0038900 |
| 1H | 256 277 577 | Missense | Ala | Val | BART1_0‐u02532 | 1, 3, 4 | −3 | n/a | HORVU.MOREX.r3.1HG0039050 |
| 2H | 265 057 192 | Missense | Pro | Ser | BART1_0‐u10642 | 11, 31 | −2.511 | Pre‐mRNA‐splicing factor ATP‐dependent RNA helicase DEAH7 | HORVU.MOREX.r3.2HG0142570, HORVU.MOREX.r3.2HG0142550, HORVU.MOREX.r3.2HG0142540 (gene split in Morex v.3) |
| 2H | 269 489 889 | Missense | Cys | Arg | BART1_0‐u10590 | 2 | −3.955 | n/a | HORVU.MOREX.r3.2HG0142940 |
| 2H | 271 533 763 | Missense | Pro | Ser | BART1_0‐u10601 | 1, 2 | −6.607 to −6.973 | E3 ubiquitin protein ligase NEURL1B | HORVU.MOREX.r3.2HG0143100 |
| 2H | 273 026 038 | Missense | Glu | Asp | BART1_0‐u10619 | 4 | −2.911 | Peptide‐N(4)‐(N‐acetyl‐β‐glucosaminyl)asparagine amidase | HORVU.MOREX.r3.2HG0143200 |
| 2H | 288 348 617 | Missense | Asp | Val | BART1_0‐u10701 | 1 | −7.99 | ATP‐dependent DNA helicase | HORVU.MOREX.r3.2HG0144360 |
| 2H | 302 860 598 | Missense | Ser | Thr | BART1_0‐u10798 | 2, 3 | −3 | n/a | no hit |
| 2H | 325 368 183 | Missense | Ser | Arg | BART1_0‐u10900 | 1, 2 | −5 | n/a | HORVU.MOREX.r3.2HG0146980 |
| 2H | 327 156 323 | Missense | His | Arg | BART1_0‐u10915 | 1 | −8 | n/a | no hit |
| 2H | 342 024 777 | Missense | Cys | Tyr | BART1_0‐u11010 | 1 | −10.236 | Tyrosine‐sulfated glycopeptide receptor 1 | HORVU.MOREX.r3.2HG0148300 |
| 2H | 352 826 802 | Missense | Lys | Met | BART1_0‐u11071 | 1 | −5.78 | AUGMIN subunit 3 | HORVU.MOREX.r3.2HG0149180 |
| 2H | 365 683 330 | Missense | Gly | Asp | BART1_0‐u11152 | 1 | −6.767 | n/a | HORVU.MOREX.r3.2HG0150130 |
| 2H | 397 248 990 | Missense | Ser | Thr | BART1_0‐u11344 | 3, 4 | −3 | P‐loop containing nucleoside triphosphate hydrolase | HORVU.MOREX.r3.2HG0152460 |
| 2H | 398 383 966 | Missense | Asn | Lys | BART1_0‐u11335 | 1 | −6 | n/a | no hit |
| 4H | 169 008 802 | Missense | Lys | Thr | BART1_0‐u27962 | 1, 2 | −4.900 to −4.933 | GRAS family transcription factor containing protein, expressed | HORVU.MOREX.r3.4HG0357830 |
| 4H | 195 116 684 | Missense | Pro | Leu | BART1_0‐u28149 | 1 | −3.439 | Putative inactive leucine‐rich repeat receptor‐like protein kinase | HORVU.MOREX.r3.4HG0360590 |
| 4H | 237 605 948 | Missense | Thr | Met | BART1_0‐u28360 | 1 | −5.473 | n/a | HORVU.MOREX.r3.4HG0363910 |
| 4H | 337 692 163 | Missense | Arg | Cys | BART1_0‐u28832 | 1, 2, 5, 6, 9, 10, 12, 15, 16, 17, 18, 19, 20, 21, 22 | −6.000 to −6.233 | Rho GTPase activator | HORVU.MOREX.r3.4HG0372920 |
| 4H | 340 149 652 | Missense | Leu | Val | BART1_0‐u28824 | 1, 2, 6, 8, 9 | −3 | n/a | no hit |
| 4H | 366 230 980 | Missense | Ser | Leu | BART1_0‐u29040 | 11, 12, 20 | −2.545 to −2.975 | β‐Adaptin‐like protein C | HORVU.MOREX.r3.4HG0374910 |
| 5H | 169 096 533 | Missense | Thr | Ile | BART1_0‐u34231 | 18 | −6 | Ureide permease 1‐like isoform X2 | HORVU.MOREX.r3.5HG0448790, HORVU.MOREX.r3.5HG0448780 (gene split in Morex v.3) |
| 5H | 200 493 783 | Missense | Ser | Tyr | BART1_0‐u34352 | 1 | −6 | n/a | no hit |
| 5H | 207 656 318 | Missense | Thr | Ile | BART1_0‐u34384 | 1 | −6 | n/a | HORVU.MOREX.r3.5HG0451070 |
| 5H | 261 369 954 | Missense | Gly | Ala | BART1_0‐u34706 | 1 | −4.628 | tRNA (guanine(37)‐N1)‐methyltransferase | HORVU.MOREX.r3.5HG0455140 |
| 6H | 231 545 723 | Missense | Thr | Ala | BART1_0‐u44549 | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13 | −2.868 to −3.139 | Xaa‐Pro dipeptidase | HORVU.MOREX.r3.6HG0581810 |
| 6H | 238 482 883 | Stop lost | STOP | Trp | BART1_0‐u44657 | 4, 6, 9, 12, 18, 23, 25, 34, 61 | −3.011 to −3.292 | Probable magnesium transporter | HORVU.MOREX.r3.6HG0582120 |
| 6H | 291 363 394 | Missense | Thr | Ala | BART1_0‐u44837 | 4, 5, 6 | −2.682 | Vesicle transport protein | HORVU.MOREX.r3.6HG0586950, HORVU.MOREX.r3.6HG0586940 (gene split in Morex v.3) |
Finally, we examined the function of genes within zone 3 to determine any over‐representation of Gene Ontology (GO) terms (Table S7, with known agriculturally important genes highlighted). When analysis was performed on combined zone‐3 gene sets compared with all genes (for all seven chromosomes), GO terms with housekeeping functions were enriched, such as nucleic acid binding, DNA integration and RNA‐dependent DNA biosynthetic processes (Figure S13), as had previously been observed by Mascher et al. (2017). When our analysis was performed individually for chromosome zone‐3 genes, varying GO terms were enriched (Figures S13 and S14). For example, pollen wall development was only found to be enriched for zone 3 of chromosome 1H, whereas root developmental genes (root morphogenesis and root hair tip) were over‐represented for zone‐3 regions of chromosomes 2H and 3H. For chromosomes 4H and 5H, zone‐3 regions were enriched with plastid‐related GO terms, including chloroplast organization, chloroplast fission and plastid translation. Zone 3 of chromosome 4H, which showed distinctive selective sweep signals in cultivars, also had translation‐related terms over‐represented, such as translational termination, translation release factor and mRNA splicing. It would be reasonable to speculate that human selection has been imposed on the variation that influences some of these biological processes. Still, more study is required to identify any beneficial alleles that are under selection. In the case of chloroplast‐related genes, it may be that the nuclear chloroplast gene‐related allelic composition has led to the selection or stochastic sampling of distinct chloroplast lineages during crop domestication and diversification (Molina‐Cano et al., 2005).

CONCLUSION

Apart from revealing further details about the complex history of domesticated barley, our pericentromeric versus non‐pericentromeric chromosomal comparisons have important practical applications. Modern, resilient barley production that ensures sustainable future harvests, in the light of challenges such as climate change (Dawson et al., 2015) and the need for greater resource‐use efficiency (Cope et al., 2020), requires the recovery and exploitation of lost subsistence farming‐derived (landrace) and naturally evolved (wild) traits through broad genomic access (Bailey‐Serres et al., 2019). This is, however, restricted in the low‐recombining pericentromeric regions of barley and other large genome cereals. Novel methods are being developed to alter the frequency and distribution of recombination and speed up the breeding process through the CRISPR/Cas9 manipulation of pro‐ and anti‐crossover (CO) genes, site‐directed nucleases and/or epigenetic modifiers, among others (Taagen et al., 2020). However, their overall effectiveness in the context of crop improvement, including their potential for introducing deleterious unintended effects (e.g. increased mutation frequency or genome instability), remains to be assessed. Here, by using a large panel of cultivar, landrace and wild barleys, and chromosome zone‐specific DNA sequence information, we have revealed in detail the extent to which the lack of recombination in pericentromeric regions has, and will likely continue to, constrain progress in barley breeding. Based on the measure of haplotype block size, we show that even the most recombination‐accessible region of the cultivated barley genome (zone 1) has only around the same accessibility as the least recombination‐accessible part of the wild barley genome (zone 3). 
Calculations of selective sweeps further indicate the consequences of linkage drag in cultivars, with the most accessible part of the barley cultivar genome having, overall, significantly higher selection scores than the least accessible genomic region of wild barley.

EXPERIMENTAL PROCEDURES

Sample selection, library preparation and exome sequencing

The germplasm chosen for this study is described in Table S1. Data on the majority of cultivars (163) and landraces (259) in the starting panel were sourced from the European project Wheat and Barley Legacy for Breeding Improvement (WHEALBI), with the domestication status of accessions as described by Bustos‐Korts et al. (2019). Other landraces (129) included in our initial panel were described by Russell et al. (2016) (known as ‘EXCAP’ accessions). Data on wild barley accessions were obtained from several sources: for 98 accessions from EXCAP (Russell et al., 2016); for 75 accessions from Barley B1K (Hübner et al., 2009); for 32 accessions from WHEALBI; for parents of a nested association mapping (NAM) population from Herzig et al. (2019); and for 61 accessions from the Wild Barley Diversity Collection (WBDC) (Steffenson et al., 2007). Library preparation and exome sequencing were described previously by Bustos‐Korts et al. (2019) and Russell et al. (2016).

Read mapping and variant calling, filtering and annotation

All sequence data were from paired‐end Illumina sequencing (https://emea.illumina.com). Sequence lengths varied between 100 and 125 bp, depending on the source data set. Quality control of the raw data was carried out using fastqc (Andrews, 2010). We followed the Genome Analysis Toolkit (gatk) Best Practices (Van der Auwera et al., 2013) for read mapping, BAM file pre‐processing and variant calling. For the latter two steps, gatk 3.4.0 was used. The gatk Best Practices guidelines recommend the mapping of raw reads to enable the accurate deduplication of paired‐end read mappings. Consequently, no read trimming was carried out prior to mapping. In this scenario, read errors and adapter sequences are flagged up by the mapping tool through soft‐clipping and are disregarded during downstream analysis. bwa‐mem (Li, 2013) was used to separately map the raw reads from each barley line to the Morex 2017 reference genome (Mascher et al., 2017), with a comparatively strict mismatch rate of 4% applied to minimize the mis‐mapping of reads to location and the consequent calling of false‐positive variants (Ribeiro et al., 2015). In accordance with gatk Best Practices, the primary read mappings were then deduplicated using samtools rmdup (Li & Durbin, 2009) to remove both optical and PCR duplicates. In the next step, indel realignment was carried out with the gatk indelrealigner tool and the resulting BAM file was used to produce an initial set of variants with the haplotypecaller tool. These variants were then filtered (QUAL > 20) with vcflib (https://github.com/vcflib/vcflib) and used as known sites for the base quality score recalibration. A second run of the haplotypecaller was used to produce a final GVCF file for each barley line, and this was the basis for joint genotype calling. Individual GVCF files were batched into cohorts of size 20 or fewer using the gatk combinegvcfs tool. 
Cohort files were then processed using the gatk genotypegvcfs tool to produce the final variant calls. Mappings and variants were visually spot‐checked using the tablet assembly viewer tool (Milne et al., 2013). To produce a robust set of variants for downstream analysis, we filtered the initial set of variants using custom java code. The objective was to create a set of variants with a minimum of missing genotype calls and a minimum of false‐positive variant calls, but with sufficient coverage of the genome. For a variant to be retained it had to pass the following filtering criteria:

- Read depth of ≥8 in at least 50% of the samples (removes variants with low read depth)
- <5% of samples with missing genotype calls (maximizes sample representation)
- At least one homozygous sample with the minor allele as its genotype (removes variants based on one or more heterozygous samples only)
- SNP QUAL score of >30 (removes low‐confidence variants)
- <2% of samples being heterozygous (removes false‐positive variants caused by mis‐mapping)
- Number of alleles = 2
- Variant type is not insertion or deletion or multi‐nucleotide polymorphism

The variants were then functionally annotated using snpeff (Cingolani et al., 2012), using the barley reference transcript data set BART 1.0 (Rapazote‐Flores et al., 2019) as the basis for predictions.
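The filtering criteria listed above could be expressed roughly as in the sketch below. The authors used custom java code that is not shown here, so this Python version is only an illustration: the in‐memory record layout is hypothetical, and taking every percentage over all samples (rather than over called samples only) is an assumption.

```python
def passes_filters(variant):
    """Apply the seven retention criteria to one variant record.

    variant: dict with 'ref' (str), 'alts' (list of str), 'qual' (float) and
    'samples', a list of (genotype, depth) pairs where genotype is a tuple of
    allele indices such as (0, 1), or None for a missing call.
    """
    samples = variant["samples"]
    n = len(samples)
    if n == 0:
        return False
    alleles = [variant["ref"]] + variant["alts"]
    if len(alleles) != 2:                       # number of alleles = 2
        return False
    if any(len(a) != 1 for a in alleles):       # no indels or MNPs
        return False
    if variant["qual"] <= 30:                   # QUAL > 30
        return False
    called = [gt for gt, _ in samples if gt is not None]
    if (n - len(called)) / n >= 0.05:           # < 5% missing calls
        return False
    if sum(depth >= 8 for _, depth in samples) / n < 0.5:  # depth >= 8 in >= 50%
        return False
    hets = sum(1 for gt in called if gt[0] != gt[1])
    if hets / n >= 0.02:                        # < 2% heterozygous samples
        return False
    ref_count = sum(gt.count(0) for gt in called)
    alt_count = sum(gt.count(1) for gt in called)
    minor = 1 if alt_count <= ref_count else 0
    # At least one sample homozygous for the minor allele.
    return any(gt == (minor, minor) for gt in called)


# Toy records for illustration only.
good = {"ref": "A", "alts": ["G"], "qual": 50.0,
        "samples": [((0, 0), 20)] * 90 + [((1, 1), 12)] * 10}
too_many_hets = {"ref": "A", "alts": ["G"], "qual": 50.0,
                 "samples": [((0, 0), 20)] * 85 + [((0, 1), 20)] * 5 + [((1, 1), 20)] * 10}
indel = {"ref": "A", "alts": ["AT"], "qual": 50.0,
         "samples": [((0, 0), 20)] * 90 + [((1, 1), 12)] * 10}
```

Checking the cheap structural criteria first (allele count, variant type, QUAL) avoids scanning the sample list for variants that would be rejected anyway.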

Comparison of on/off‐target variants and rare/non‐rare variants

To allow a comparative analysis of variants that were on/off target with regard to the exome capture probes, the exome capture design file was obtained from the Nimblegen website (https://sftp.rch.cm/diagnostics/sequencing/nimblegen_annotations/ez_barley_exome/barley_exome.zip) and the capture probe sequences were mapped to the Morex 2017 reference genome using blastn (Altschul et al., 1990; Camacho et al., 2009), with an e‐value cut‐off of 1e‐10 and a minimum percentage identity of >90. The bedtools intersect method (Quinlan & Hall, 2010) was then used to compute the overlap between the filtered variants and the mapping positions of the exome capture probes, and variants overlapping the probes were classified as on target, whereas the remainder were classified as off target. Read depth and variant quality scores were then extracted from the VCF file using vcftools (Danecek et al., 2011). ‘Rare’ SNPs were defined as those with an MAF of <0.05. The averaged genotype quality score (GQ) was extracted for rare and non‐rare SNPs from the VCF file using vcftools (Danecek et al., 2011). To compare GQ between major and minor alleles, the values for each called position were extracted across accessions using vcftools and grouped into major and minor alleles using a custom python script to produce distribution plots.
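The two classifications in this section (on/off target by overlap with probe positions, and rare/non‐rare by MAF) can be illustrated with the sketch below. It does not reproduce bedtools or vcftools; it is a pure‐Python stand‐in in which the coordinates are treated as 1‐based inclusive (an assumption) and all probe and SNP positions are invented.

```python
from bisect import bisect_right

RARE_MAF = 0.05  # 'rare' threshold given in the text


def build_index(probes):
    """Group probe intervals (chrom, start, end) by chromosome and merge
    overlapping or adjacent intervals so lookups can use binary search."""
    by_chrom = {}
    for chrom, start, end in probes:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = merged[-1]
            if start <= last_end + 1:
                merged[-1] = (last_start, max(last_end, end))
            else:
                merged.append((start, end))
        index[chrom] = merged
    return index


def is_on_target(index, chrom, pos):
    """True if pos lies inside any merged probe interval on chrom."""
    intervals = index.get(chrom, [])
    # Last interval whose start is <= pos.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos <= intervals[i][1]


def minor_allele_frequency(alt_count, total_alleles):
    freq = alt_count / total_alleles
    return min(freq, 1 - freq)


def is_rare(maf):
    return maf < RARE_MAF


# Hypothetical probe positions, for illustration only.
index = build_index([("chr1H", 100, 200), ("chr1H", 150, 260), ("chr2H", 50, 80)])
```

Merging the probe intervals first means each lookup only has to inspect a single candidate interval.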

Genome‐wide relatedness and ordination

A target of 10 000 SNPs (n = 9845) was randomly selected from the filtered variant data set using selectvariants in gatk for the reconstruction of genome‐wide relatedness and PCO. The PCO was performed using past 3.25 (Hammer et al., 2001) and the result visualized by curlywhirly 1.19.03 (https://ics.hutton.ac.uk/curlywhirly/).

Barley genetic landscape

Genetic diversity (π) and pairwise FST values for SNPs were calculated using ‘--site-pi’ and ‘--weir-fst-pop’, respectively, in vcftools. The π values were plotted using a moving average method with a window size of 10 000 bp, whereas the FST values were plotted on a per‐site basis so that the fine‐scale horizontal track patterns in pericentromeric regions could be observed. The zone‐3 genotype heat map was visualized with flapjack (Milne et al., 2010), with SNPs having MAFs of <0.05 being excluded to reduce noise, without altering the overall genetic variation pattern. The LD haplotype blocks were estimated using the ‘--blocks’ function in plink 1.904 (Purcell et al., 2007), under default settings, following the block definition method of Gabriel et al. (2002), except that the maximum block size was increased to allow large blocks that could potentially cover whole chromosomes (the ‘--blocks-max-kb’ parameter was set to 800 000 kbp). A similar approach had been used previously in wheat (Hao et al., 2017). The LD decay profiles (R² vs distance) were calculated based on a thinned SNP data set (thinned using the ‘--thin’ function in vcftools), to keep only SNPs separated by at least 10 000 bp. The thinned data were used for LD estimation via the plink ‘--r2’ function, with options applied to allow the calculation of R² for all pairwise SNPs within a given window size of 15 000 kbp (--ld-window 100000 --ld-window-kb 15000), with R² values above 0.05 being reported. Distance information used for the final visualization was taken from the plink LD output file (BP_B – BP_A). Haplotype counts for chromosomes and chromosome zones were corrected estimates accounting for the different sample sizes of cultivar, landrace and wild barley categories. For each category, counts were based on randomly selected samples of 100 accessions. The randomization procedure was performed 100 times and average values were used. 
We applied this sample size correction specifically to haplotype richness estimates because of the potential high sensitivity of this parameter to sample size (when there are a large number of different haplotype states), which is not the case for individual SNP‐based (i.e. biallelic) diversity estimates such as π. Signatures of selective sweeps were detected using raisd 2.4 (Alachiotis & Pavlidis, 2018), with the option to impute missing data (-M 1). The 95th percentile of μ was calculated for each population and used as the threshold to highlight outlier SNPs. All plotting was performed with r 3.6.0 and moving averages calculated using the ‘rollapply’ function of zoo 1.8‐8 (Zeileis & Grothendieck, 2005). The chromosome containing unmapped contigs (chrUn) was excluded from all analyses.
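The sample‐size correction for haplotype counts (100 accessions drawn at random, repeated 100 times, then averaged) can be sketched as below. This is an illustration rather than the authors' code: sampling without replacement and the fixed random seed are assumptions, and the haplotype labels are invented.

```python
import random


def rarefied_haplotype_count(haplotypes, sample_size=100, n_reps=100, seed=0):
    """Average number of distinct haplotypes seen in repeated random
    subsamples of a fixed size (drawn without replacement)."""
    if len(haplotypes) < sample_size:
        raise ValueError("category has fewer accessions than the sample size")
    rng = random.Random(seed)
    total = 0
    for _ in range(n_reps):
        subsample = rng.sample(haplotypes, sample_size)
        total += len(set(subsample))
    return total / n_reps


# Toy data: one dominant haplotype in 'cultivars', all-distinct 'wild' haplotypes.
cultivar_haps = ["h1"] * 150 + ["h2"] * 10
wild_haps = ["w%d" % i for i in range(300)]

cultivar_richness = rarefied_haplotype_count(cultivar_haps)
wild_richness = rarefied_haplotype_count(wild_haps)
```

Because every category is reduced to the same number of accessions before counting, a larger category cannot appear richer simply because more accessions were sampled from it.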

Zone‐3 evolution comparison

We followed the zone‐3 coordinates reported in the Morex 2017 reference genome paper (Mascher et al., 2017) and separated SNPs based on the coordinates for each chromosome. The ‘phylogenies’ and PCO analyses were performed as described in a previous section. For the intraspecific ‘phylogenetic’ relatedness analysis, the VCF file was first converted to PHYLIP format using vcf2phylip.py 2.0 (https://github.com/edgardomortiz/vcf2phylip). The GTR + G4 model was then selected under the Akaike information criterion (AIC) calculated via modeltest‐ng 0.1.6 (Darriba et al., 2020), and the unrooted ML tree was estimated using raxml‐ng 0.6.0 (Kozlov et al., 2019). Trees were visualized using the interactive Tree Of Life (iTOL) web server (Letunic & Bork, 2019).

Identification of BaRTv.1 homologues in Morex v.3

BART1 homologues in the Morex v.3 reference assembly (Mascher et al., 2021) were identified with blastp (Altschul et al., 1990) using BART1 proteins as queries and Morex v.3 proteins as subjects. Raw hits were sorted by percentage identity (descending) and query coverage per high‐scoring segment pair (HSP) (descending) and then filtered by percentage identity (≥98%). This leaves the best hit topmost but still retains multiple transcripts for each query. We then removed duplicates by query gene and subject gene to leave the best hit for a given query–subject gene combination, while still allowing for split/fused genes. Some BART1 genes had no hits in Morex v.3 with the above approach, whereas others had multiple hits, presumably with genes having been fused or collapsed in BART1.
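The hit‐selection logic described above (sort by identity and query coverage, keep hits with ≥98% identity, then keep the best hit per query gene–subject gene pair) can be sketched as follows. The field names are hypothetical labels for parsed tabular blastp output, and the gene identifiers are invented.

```python
MIN_IDENTITY = 98.0  # percentage identity threshold given in the text


def best_hits(hits, min_identity=MIN_IDENTITY):
    """Keep the top-ranked hit for each (query gene, subject gene) pair.

    hits: list of dicts with 'query_gene', 'subject_gene', 'pident'
    (percentage identity) and 'qcovhsp' (query coverage per HSP).
    """
    # Rank by identity, then by coverage, both descending.
    ranked = sorted(hits, key=lambda h: (-h["pident"], -h["qcovhsp"]))
    seen = set()
    kept = []
    for hit in ranked:
        if hit["pident"] < min_identity:
            continue
        key = (hit["query_gene"], hit["subject_gene"])
        if key in seen:           # a better hit for this pair was already kept
            continue
        seen.add(key)
        kept.append(hit)
    return kept


# Invented hits for illustration only.
toy_hits = [
    {"query_gene": "geneA", "subject_gene": "subj1", "pident": 99.0, "qcovhsp": 95},
    {"query_gene": "geneA", "subject_gene": "subj1", "pident": 99.5, "qcovhsp": 100},
    {"query_gene": "geneA", "subject_gene": "subj2", "pident": 98.2, "qcovhsp": 60},
    {"query_gene": "geneB", "subject_gene": "subj3", "pident": 97.0, "qcovhsp": 100},
]
kept = best_hits(toy_hits)
pairs = [(h["query_gene"], h["subject_gene"]) for h in kept]
```

In this toy case geneA retains two subject genes, as would happen for a gene split in the newer assembly, whereas geneB has no hit above the identity threshold.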

CONFLICT OF INTEREST

The authors declare that they have no conflicts of interest associated with this work.

AUTHOR CONTRIBUTIONS

YYC carried out the statistical and genetic analysis and drafted the first version of the article. MS, MMB and PEH assembled the exome capture data and performed variant calling, filtering and annotation. IKD contributed to genetic interpretation and writing the article. LL contributed to the evolutionary interpretation and editing of final version for publication. AA, KPS and JCF generated exome capture data from a section of wild barley lines (Table S1, WBDC). GM and BJS collected and assembled the WBDC collection. PLM contributed to the genetic and evolutionary interpretation and drafting the article. RW conceived the project and assembled the collaborators. JR conceived part of the project and contributed to the interpretation and writing of the article.

SUPPORTING INFORMATION

Figure S1. Geographical distribution of the genotyped barley germplasm.
Figure S2. Comparison between rare (minor allele frequency, MAF < 0.05; n = 2 742 309) single‐nucleotide polymorphisms (SNPs) and other (n = 340 564) SNPs.
Figure S3. Comparison between on‐target (n = 1 736 337) and off‐target (n = 1 346 536) single‐nucleotide polymorphisms (SNPs).
Figure S4. Extent of linkage disequilibrium by groups (γ2).
Figure S5. Principal component analysis (PCA) plot of zone‐3 regions, and comparison of maximum‐likelihood (ML) phylogenies derived from zone‐1 + zone‐2 regions with that derived from zone‐3 regions.
Figure S6. Genetic diversity in chr1H pericentromeric regions.
Figure S7. Genetic diversity in chr2H pericentromeric regions.
Figure S8. Genetic diversity in chr3H pericentromeric regions.
Figure S9. Genetic diversity in chr4H pericentromeric regions.
Figure S10. Genetic diversity in chr5H pericentromeric regions.
Figure S11. Genetic diversity in chr6H pericentromeric regions.
Figure S12. Genetic diversity in chr7H pericentromeric regions.
Figure S13. Word clouds for the Gene Ontology (GO) enrichment results for zone‐3 genes.
Figure S14. Gene Ontology (GO) terms in zone 3 of each chromosome.
Table S1. Information of 879 exome sequence Hordeum vulgare (barley) accessions.
Table S2. The filtered single‐nucleotide polymorphism (SNP) set.
Table S3. Number of genes covered in exome capture sequencing.
Table S4. The result from non‐parametric analysis of variance (Kruskal–Wallis H‐test) suggests at least one of the groups shows a significant difference in block size among all nine groups tested (P < 0.01).
Table S5. Haplotype grouping (haplogroup) for each accession.
Table S6. Summary of non‐synonymous alleles in zone 3.
Table S7. Agriculturally important known Hordeum vulgare (barley) genes.
REFERENCES (60 in total)

1. Pablo Cingolani; Adrian Platts; Le Lily Wang; Melissa Coon; Tung Nguyen; Luan Wang; Susan J Land; Xiangyi Lu; Douglas M Ruden. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin), 2012 Apr-Jun.

2. Joanne Russell; Martin Mascher; Ian K Dawson; Stylianos Kyriakidis; Cristiane Calixto; Fabian Freund; Micha Bayer; Iain Milne; Tony Marshall-Griffiths; Shane Heinen; Anna Hofstad; Rajiv Sharma; Axel Himmelbach; Manuela Knauft; Maarten van Zonneveld; John W S Brown; Karl Schmid; Benjamin Kilian; Gary J Muehlbauer; Nils Stein; Robbie Waugh. Exome sequencing of geographically diverse barley landraces and wild relatives gives insights into environmental adaptation. Nat Genet, 2016-07-18.

3. Julia Bailey-Serres; Jane E Parker; Elizabeth A Ainsworth; Giles E D Oldroyd; Julian I Schroeder. Genetic strategies for improving crop yields. Nature, 2019-11-06. (Review)

4. W G Hill; A Robertson. The effect of linkage on limits to artificial selection. Genet Res, 1966-12.

5. Peter L Morrell; Michael T Clegg. Genetic evidence for a second domestication of barley (Hordeum vulgare) east of the Fertile Crescent. Proc Natl Acad Sci U S A, 2007-02-21.

6. James D Higgins; Kim Osman; Gareth H Jones; F Chris H Franklin. Factors underlying restricted crossover localization in barley meiosis. Annu Rev Genet, 2014-08-01. (Review)

7. Ramesh Krishnan Ramasamy; Sumathy Ramasamy; Bharat Bushan Bindroo; V Girish Naik. STRUCTURE PLOT: a program for drawing elegant STRUCTURE bar plots in user friendly interface. Springerplus, 2014-08-13.

8. Thomas J Y Kono; Fengli Fu; Mohsen Mohammadi; Paul J Hoffman; Chaochih Liu; Robert M Stupar; Kevin P Smith; Peter Tiffin; Justin C Fay; Peter L Morrell. The Role of Deleterious Substitutions in Crop Genomes. Mol Biol Evol, 2016-06-14.

9. Ali Saleh Hassan; Kelly Houston; Jelle Lahnstein; Neil Shirley; Julian G Schwerdt; Michael J Gidley; Robbie Waugh; Alan Little; Rachel A Burton. A Genome Wide Association Study of arabinoxylan content in 2-row spring barley grain. PLoS One, 2017-08-03.

10. Heng Li; Richard Durbin. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 2009-05-18.
