Martin Mascher, Todd A Richmond, Daniel J Gerhardt, Axel Himmelbach, Leah Clissold, Dharanya Sampath, Sarah Ayling, Burkhard Steuernagel, Matthias Pfeifer, Mark D'Ascenzo, Eduard D Akhunov, Pete E Hedley, Ana M Gonzales, Peter L Morrell, Benjamin Kilian, Frank R Blattner, Uwe Scholz, Klaus F X Mayer, Andrew J Flavell, Gary J Muehlbauer, Robbie Waugh, Jeffrey A Jeddeloh, Nils Stein.
Abstract
Advanced resources for genome-assisted research in barley (Hordeum vulgare), including a whole-genome shotgun assembly and an integrated physical map, have recently become available. These have made possible studies that aim to assess genetic diversity or to isolate single genes by whole-genome resequencing and in silico variant detection. However, such an approach remains expensive given the 5 Gb size of the barley genome. Targeted sequencing of the mRNA-coding exome reduces barley genomic complexity more than 50-fold, thus dramatically reducing this heavy sequencing and analysis load. We have developed and employed an in-solution hybridization-based sequence capture platform to selectively enrich for a 61.6 megabase coding sequence target that includes predicted genes from the genome assembly of the cultivar Morex as well as publicly available full-length cDNAs and de novo assembled RNA-Seq consensus sequence contigs. The platform provides a highly specific capture with substantial and reproducible enrichment of targeted exons, both for cultivated barley and related species. We show that this exome capture platform provides a clear path towards a broader and deeper understanding of the natural variation residing in the mRNA-coding part of the barley genome and will thus constitute a valuable resource for applications such as mapping-by-sequencing and genetic diversity analyses.
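The "more than 50-fold" reduction can be checked with simple arithmetic. The sketch below divides the stated genome size by the stated target size. The abstract does not say how its 50-fold figure was derived, so the decimal unit conversion and the choice of the 61.6 Mb target as the denominator are assumptions made for illustration.

```python
# Figures taken from the abstract; decimal (SI) units are an assumption.
GENOME_SIZE_BP = 5_000_000_000   # ~5 Gb barley genome
TARGET_SIZE_BP = 61_600_000      # 61.6 Mb coding sequence target


def fold_reduction(genome_bp: int, target_bp: int) -> float:
    """Return how many times smaller the target is than the genome."""
    return genome_bp / target_bp


reduction = fold_reduction(GENOME_SIZE_BP, TARGET_SIZE_BP)
print(f"Complexity reduction: {reduction:.1f}-fold")
```

Under these assumptions the ratio comes out at roughly 81-fold, which is consistent with the abstract's lower bound of 50-fold.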
Keywords: Hordeum bulbosum; Hordeum pubiflorum; Hordeum vulgare; Triticeae; barley; genetic diversity; genomics; targeted resequencing
Year: 2013 PMID: 23889683 PMCID: PMC4241023 DOI: 10.1111/tpj.12294
Source DB: PubMed Journal: Plant J ISSN: 0960-7412 Impact factor: 6.417
Sequence input for capture design
| Probe set | No. of sequences | Length (bp) | Predicted coverage (%) | Source |
|---|---|---|---|---|
| Cufflinks predictions | 155 863 | 56 418 032 | 98.5 | |
| Full-length cDNA | 35 297 | 14 707 427 | 97.0 | |
| RNA-Seq contigs | 108 759 | 30 663 579 | 97.9 | This study |
| Total | 300 919 | 101 789 038 | 97.8 | |
Number of sequence contigs after filtering for repetitive sequences.
Predicted sequence capture coverage when capture targets are extended by 100 bp on each side.
Barley cultivars and wild relatives included in this study
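| Species | Cultivar/accession | No. of captures |
|---|---|---|
| Hordeum vulgare | Barke | 11 |
| Hordeum vulgare | Borwina | 1 |
| Hordeum vulgare | Bonus | 1 |
| Hordeum vulgare | Bowman | 3 |
| Hordeum vulgare | Foma | 1 |
| Hordeum vulgare | Gull | 1 |
| Hordeum vulgare | Harrington | 1 |
| Hordeum vulgare | Haruna Nijo | 1 |
| Hordeum vulgare | Igri | 1 |
| Hordeum vulgare | Kindred | 1 |
| Hordeum vulgare | Morex | 11 |
| Hordeum vulgare | Steptoe | 2 |
| Hordeum vulgare | Vogelsanger Gold | 1 |
| Hordeum spontaneum | B1K-04-12 | 1 |
| Hordeum spontaneum | OUH602 | 1 |
| Hordeum spontaneum | B1K-03-07 | 1 |
| Hordeum bulbosum | A42 (autotetraploid) | 1 |
| Hordeum bulbosum | BCC 2061 (diploid) | 1 |
| Hordeum bulbosum | 2940/4 (diploid) | 1 |
| Hordeum pubiflorum | BCC 2028 | 1 |
| Triticum aestivum | Chinese Spring | 1 |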
Figure 1. Target coverage in cultivars and related species. The percentage of target regions with at least 10-fold coverage (a), and the median coverage of target regions (b), are plotted as a function of raw sequencing output. Different symbols are used for samples from different species. The legend is given in (b) for both panels. Regression lines were obtained by fitting the model log(1-y)∼log(x) (a), or a linear model (b), to the data points of Hordeum vulgare.
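The regression model named in the Figure 1 caption, log(1-y)∼log(x), can be sketched as an ordinary least-squares fit on log-transformed values, where y is the fraction of target regions with at least 10-fold coverage and x is the raw sequencing output. The function names, the choice of natural logarithms, and the data points below are hypothetical. The points are generated from a known curve, not taken from the study.

```python
import math


def fit_log_log(xs, ys):
    """Ordinary least squares for log(1 - y) = a + b * log(x).

    xs: raw sequencing output per sample (any positive unit)
    ys: fraction of target regions with >= 10-fold coverage, 0 <= y < 1
    Returns the intercept a and slope b.
    """
    u = [math.log(x) for x in xs]
    v = [math.log(1.0 - y) for y in ys]
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    b = sxy / sxx
    a = mean_v - b * mean_u
    return a, b


def predict_fraction(a, b, x):
    """Back-transform the fitted line to a predicted covered fraction."""
    return 1.0 - math.exp(a + b * math.log(x))


# Hypothetical data generated from y = 1 - 2 * x**-1.5 (not from the paper).
xs = [2.0, 4.0, 8.0, 16.0]
ys = [1.0 - 2.0 * x ** -1.5 for x in xs]
a, b = fit_log_log(xs, ys)
print(f"intercept={a:.3f}, slope={b:.3f}")
```

Fitting on the transformed scale keeps predicted fractions below 1, which suits a quantity that saturates as sequencing output grows.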
Figure 2. Target coverage in library combinations. The median per-base coverage in target regions is depicted for cultivars Morex (green), Barke (orange), Bowman (blue) and Steptoe (gray). Libraries from these cultivars were combined at different molar ratios. The first group of bars corresponds to an experiment in which libraries from Morex and Barke were combined after hybridization. Otherwise, libraries were hybridized to the capture assay in combination. The molar ratios of the combinations are given below the bars. The observed ratios between the median target coverages are printed above the bars.
Figure 3. Coverage across 13 cultivars. The size of sequence intervals on the barley WGS assembly covered by at least 1, 2, 5, 10, 20 or 50 reads in all captured samples from a set of 13 cultivars. All base positions (black dots) or only positions in target regions (red dots) were considered.
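The quantity plotted in Figure 3 (bases covered by at least k reads in every sample) can be illustrated by taking the minimum depth across samples at each position and counting how many positions reach each threshold. This is a minimal sketch: the function name, the input layout, and the toy depths are hypothetical, and a real analysis would read depths from alignment files instead of in-memory lists.

```python
def bases_covered_in_all(depths_by_sample, thresholds=(1, 2, 5, 10, 20, 50)):
    """Count positions whose depth is >= each threshold in every sample.

    depths_by_sample: dict mapping sample name -> list of per-base read
    depths, with all lists aligned to the same reference positions.
    """
    per_sample = list(depths_by_sample.values())
    # The minimum across samples decides whether a position passes in all.
    min_depth = [min(column) for column in zip(*per_sample)]
    return {t: sum(1 for d in min_depth if d >= t) for t in thresholds}


# Toy depths for three hypothetical samples over six positions.
toy = {
    "sample_a": [0, 3, 12, 55, 20, 7],
    "sample_b": [1, 2, 10, 60, 25, 5],
    "sample_c": [4, 2, 11, 50, 19, 9],
}
counts = bases_covered_in_all(toy)
print(counts)
```

Restricting the input to positions inside target regions would give the second series described in the caption.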
Figure 4. Number of detected single-nucleotide polymorphisms (SNPs). The number of SNPs detected between Morex and samples from other barley cultivars and related species across the genome (a) or only in target regions (b) is plotted as a function of sequencing depth. The legend for both panels is shown in (a). Regression lines were obtained by fitting a linear model to Hordeum vulgare data points.
Figure 5. Frequency of single-nucleotide polymorphisms (SNPs) in exons along the barley chromosomes. The SNP frequency in 5 Mb windows along the physical length of the seven barley chromosomes is plotted for 12 cultivars, three accessions from species in the genus Hordeum and bread wheat cv. Chinese Spring. SNP calling was performed against a reference sequence from barley cultivar ‘Morex’. The positions of contigs were taken from the barley physical framework (IBSC, 2012).
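The windowing described in the Figure 5 caption can be sketched by assigning each SNP to a 5 Mb bin along its chromosome. The caption does not say whether frequencies were normalized (for example by the number of target bases per window), so this sketch only produces raw counts. The function name, the 0-based coordinates, and the SNP positions are hypothetical.

```python
from collections import Counter

WINDOW_BP = 5_000_000  # 5 Mb windows, as stated in the caption


def snps_per_window(snp_positions, window_bp=WINDOW_BP):
    """Count SNPs per (chromosome, window index).

    snp_positions: iterable of (chromosome, 0-based position) tuples.
    """
    counts = Counter()
    for chrom, pos in snp_positions:
        counts[(chrom, pos // window_bp)] += 1
    return counts


# Hypothetical SNP positions on two chromosomes (not from the study).
toy_snps = [
    ("1H", 120_000),
    ("1H", 4_999_999),
    ("1H", 5_000_000),
    ("2H", 12_500_000),
]
windows = snps_per_window(toy_snps)
for (chrom, index), n in sorted(windows.items()):
    print(chrom, index * WINDOW_BP, n)
```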
Figure 6. Neighbor-joining tree. The tree depicts the single-nucleotide polymorphism (SNP) distance between 13 barley cultivars and three Hordeum spontaneum accessions as inferred from genotype calls at 122 940 bi-allelic SNPs with no missing data. A tree including H. bulbosum and H. pubiflorum accessions is shown in Figure S3.
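A neighbor-joining tree is built from a pairwise distance matrix. The caption does not specify the distance measure, so the sketch below assumes the simplest option: the fraction of bi-allelic sites at which two samples carry different genotype calls, with no missing data. The function names, sample names, and genotype vectors are hypothetical, and the tree-building step itself is not shown.

```python
from itertools import combinations


def snp_distance(calls_a, calls_b):
    """Fraction of sites at which two samples have different genotype calls."""
    if len(calls_a) != len(calls_b):
        raise ValueError("samples must be genotyped at the same sites")
    differing = sum(1 for a, b in zip(calls_a, calls_b) if a != b)
    return differing / len(calls_a)


def distance_matrix(genotypes):
    """Return {(sample_i, sample_j): distance} for all unordered pairs."""
    return {
        (s, t): snp_distance(genotypes[s], genotypes[t])
        for s, t in combinations(sorted(genotypes), 2)
    }


# 0 = reference allele, 1 = alternative allele; hypothetical calls.
toy_genotypes = {
    "cultivar_x": [0, 0, 1, 1, 0, 1, 0, 0],
    "cultivar_y": [0, 1, 1, 1, 0, 0, 0, 0],
    "wild_z": [1, 1, 0, 1, 1, 0, 0, 1],
}
dist = distance_matrix(toy_genotypes)
for pair, d in dist.items():
    print(pair, d)
```

The resulting pairwise distances are the kind of input a neighbor-joining implementation would take to produce the tree.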