Xuehui Li, Ananta Acharya, Andrew D Farmer, John A Crow, Arvind K Bharti, Robin S Kramer, Yanling Wei, Yuanhong Han, Jiqing Gou, Gregory D May, Maria J Monteros, E Charles Brummer.
Abstract
BACKGROUND: Alfalfa, a perennial, outcrossing species, is a widely planted forage legume producing highly nutritious biomass. Currently, improvement of cultivated alfalfa mainly relies on recurrent phenotypic selection. Marker assisted breeding strategies can enhance alfalfa improvement efforts, particularly if many genome-wide markers are available. Transcriptome sequencing enables efficient high-throughput discovery of single nucleotide polymorphism (SNP) markers for a complex polyploid species.

RESULTS: The transcriptomes of 27 alfalfa genotypes, including elite breeding genotypes, parents of mapping populations, and unimproved wild genotypes, were sequenced using an Illumina Genome Analyzer IIx. De novo assembly of quality-filtered 72-bp reads generated 25,183 contigs with a total length of 26.8 Mbp and an average length of 1,065 bp, with an average read depth of 55.9-fold for each genotype. Overall, 21,954 (87.2%) of the 25,183 contigs represented 14,878 unique protein accessions. Gene ontology (GO) analysis suggested that a broad diversity of genes was represented in the resulting sequences. The realignment of individual reads to the contigs enabled the detection of 872,384 SNPs and 31,760 InDels. High resolution melting (HRM) analysis was used to validate 91% of 192 putative SNPs identified by sequencing. Both allelic variants at about 95% of SNP sites identified among five wild, unimproved genotypes are still present in cultivated alfalfa, and all four US breeding programs also contain a high proportion of these SNPs. Thus, little evidence exists among this dataset for loss of significant DNA sequence diversity from either domestication or breeding of alfalfa. Structure analysis indicated that individuals from the subspecies falcata, the diploid subspecies caerulea, and the tetraploid subspecies sativa (cultivated tetraploid alfalfa) were clearly separated.
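The assembly figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the rounded numbers given above; the variable names are illustrative and it assumes the total length of 26.8 Mbp was rounded to one decimal place.

```python
# Figures quoted in the abstract.
n_contigs = 25_183
total_length_mbp = 26.8      # assumed to be rounded to one decimal place
reported_mean_bp = 1_065
annotated_contigs = 21_954

# Mean contig length implied by the rounded total length.
mean_bp = total_length_mbp * 1e6 / n_contigs

# Because the total is rounded to 0.1 Mbp, the true mean lies in this interval.
low_bp = (total_length_mbp - 0.05) * 1e6 / n_contigs
high_bp = (total_length_mbp + 0.05) * 1e6 / n_contigs

# Share of contigs that matched a protein accession.
annotated_pct = 100 * annotated_contigs / n_contigs

print(f"implied mean contig length: {mean_bp:.0f} bp "
      f"(range {low_bp:.0f}-{high_bp:.0f} bp)")
print(f"contigs with a protein match: {annotated_pct:.1f}%")
```

The reported average of 1,065 bp falls inside the interval implied by the rounded total length, and the annotated share reproduces the 87.2% given in the abstract.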
Year: 2012 PMID: 23107476 PMCID: PMC3533575 DOI: 10.1186/1471-2164-13-568
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
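Description of the 27 alfalfa genotypes used in this study, their approximate fall dormancy level, and overall sequence statistics
| Genotype | Category† | Source | Ploidy | Subspecies | Fall dormancy* | Sequence statistics | | |
|---|---|---|---|---|---|---|---|---|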
| B75GH-402 | 1 | FGI | 4 | subsp. | 4 | 26.5 | 92.9 | 86.3 |
| B85-912 | 1 | FGI | 4 | subsp. | 4 | 23.9 | 92.8 | 86.0 |
| B85-920 | 1 | FGI | 4 | subsp. | 4 | 29.5 | 92.9 | 86.6 |
| B86-220 | 1 | FGI | 4 | subsp. | 4 | 29.9 | 93.4 | 87.2 |
| CW A-9 | 1 | C/W | 4 | subsp. | 9 | 28.2 | 93.2 | 87.1 |
| CW B-7 | 1 | C/W | 4 | subsp. | 7 | 31.6 | 93.3 | 87.0 |
| CW D-10 | 1 | C/W | 4 | subsp. | 10 | 27.5 | 92.9 | 86.6 |
| CW I-4 | 1 | C/W | 4 | subsp. | 4 | 24.8 | 92.7 | 85.9 |
| CV020017 | 1 | Pioneer | 4 | subsp. | 4 | 28.1 | 92.8 | 86.4 |
| DW000577 | 1 | Pioneer | 4 | subsp. | 4 | 24.5 | 92.9 | 86.3 |
| LH050543 | 1 | Pioneer | 4 | subsp. | 3 | 24.9 | 92.8 | 86.4 |
| NL002724 | 1 | Pioneer | 4 | subsp. | 4 | 23.1 | 93.8 | 88.5 |
| DL317 | 1 | DL | 4 | subsp. | 8 | 28.1 | 92.9 | 86.2 |
| DL833 | 1 | DL | 4 | subsp. | 4 | 30.1 | 93.3 | 87.0 |
| DL879W4 | 1 | DL | 4 | subsp. | 4 | 27.4 | 93.4 | 86.9 |
| DL263 | 1 | DL | 4 | subsp. | 4 | 24.7 | 93.3 | 87.0 |
| 95-608 | 2 | [ | 4 | subsp. | 9 | 28.1 | 92.8 | 86.0 |
| Altet-4 | 2 | [ | 4 | subsp. | 5 | 26.8 | 93.0 | 86.1 |
| NECS-141 | 2 | [ | 4 | subsp. | 4 | 25.8 | 92.3 | 85.6 |
| ABI408 | 2 | [ | 4 | subsp. | 3 | 27.3 | 93.2 | 86.8 |
| Gabès | 2 | INRA, [ | 4 | subsp. | 10 | 17.2 | 92.7 | 86.8 |
| Magali-A | 2 | INRA, [ | 4 | subsp. | 6 | 18.8 | 92.3 | 85.2 |
| PI243225-A | 3 | NPGS | 2 | subsp. | unknown | 26.2 | 94.0 | 88.4 |
| PI577551-B | 3 | NPGS | 2 | subsp. | unknown | 19.5 | 92.7 | 86.0 |
| PI631816-A | 3 | NPGS | 2 | subsp. | unknown | 25.1 | 92.6 | 86.6 |
| PI251830-K | 3 | NPGS | 2 | subsp. | unknown | 29.6 | 92.3 | 85.8 |
| WISFAL-6 | 3 | [ | 4 | subsp. | 2 | 26.8 | 93.1 | 86.7 |
† 1 = elite cultivated genotypes from commercial USA breeding programs; 2 = non-elite cultivated genotypes; 3 = non-cultivated, wild germplasm.
FGI = Forage Genetics International; C/W = Cal/West Seeds; Pioneer = Pioneer Hi-bred, Inc.; DL = Dairyland, Inc.; INRA = French National Institute for Agricultural Research; NPGS = National Plant Germplasm System, United States Department of Agriculture.
*Fall dormancy based on the scale of 1 = very fall dormant to 11 = very non-dormant [23].
Number and proportion of contigs with homology to gene models (E value of 1× 10) for different contig lengths
| Contig length | Number of contigs | Contigs with homology to gene models (%) |
|---|---|---|
| 100-249 bp | 2,046 | 961 (47.0%) |
| 250-499 bp | 5,365 | 3,955 (73.7%) |
| 500-749 bp | 4,318 | 3,804 (88.1%) |
| 750-999 bp | 3,211 | 3,023 (94.1%) |
| ≥1,000 bp | 10,243 | 10,073 (98.3%) |
| Total | 25,183 | 21,816 (86.6%) |
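The proportions in this table follow directly from the two count columns. As a minimal sketch (the variable names are illustrative, and the counts are copied from the rows above), the per-class percentages and the totals can be recomputed as follows:

```python
# Rows of the table: (length class, number of contigs, contigs with homology).
rows = [
    ("100-249 bp", 2_046, 961),
    ("250-499 bp", 5_365, 3_955),
    ("500-749 bp", 4_318, 3_804),
    ("750-999 bp", 3_211, 3_023),
    (">=1,000 bp", 10_243, 10_073),
]

# Column totals should match the "Total" row of the table.
total_contigs = sum(n for _, n, _ in rows)
total_hits = sum(h for _, _, h in rows)

# Proportion of contigs with homology in each length class.
for label, n, hits in rows:
    print(f"{label:>12}: {hits:>6,} / {n:>6,} = {100 * hits / n:.1f}%")
print(f"{'Total':>12}: {total_hits:>6,} / {total_contigs:>6,} = "
      f"{100 * total_hits / total_contigs:.1f}%")
```

The recomputed values show the trend the table reports: the share of contigs with homology to gene models rises steadily with contig length.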
Figure 1. An example of high resolution melting analysis to confirm a putative SNP. The number of sequence reads for the two SNP allele variants is given for each of six genotypes and the predicted genotype from sequencing corresponds to the peak resulting from HRM analysis.
Figure 2. Venn diagram showing the number of SNP segregating within three groups of alfalfa genotypes. Numbers are based on a total number of SNP equal to 173,947 for which sequence information from all 27 genotypes was available.
Figure 3. Numbers of SNP that showed polymorphism within groups of four genotypes derived from four commercial breeding programs. Numbers in the Venn diagram are based on 160,901 SNP polymorphic within 16 elite cultivated genotypes from four commercial breeding programs. FGI = Forage Genetics International; C/W = Cal/West Seeds; Pioneer = Pioneer Hi-bred, Inc.; DL = Dairyland, Inc.
Figure 4. Plot of the first two principal components from a principal components analysis of SNP variation among 27 alfalfa genotypes. Blue solid circles represent tetraploid sativa; red solid circle represents tetraploid falcata; blue triangles represent diploid caerulea; red triangles represent diploid falcata.
Figure 5. A neighbor-joining phylogenetic tree of 27 alfalfa genotypes based on SNP variation. Red line with a bold label represents tetraploid falcata; red lines with italic labels represent diploid falcata; black lines with bold labels represent tetraploid sativa; black lines with italic labels represent diploid caerulea.
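The counts in a Venn diagram of this kind can be derived from per-group sets of segregating SNPs. The following is a minimal sketch of that bookkeeping, not the authors' pipeline: the SNP identifiers, the group names, and the `venn_regions` helper are all hypothetical toy values chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy data: SNP identifiers found polymorphic within each group.
polymorphic = {
    "elite": {"snp1", "snp2", "snp3", "snp5"},
    "non_elite": {"snp1", "snp2", "snp4", "snp5"},
    "wild": {"snp1", "snp3", "snp4", "snp6"},
}


def venn_regions(groups):
    """Count SNPs that segregate in exactly each combination of groups."""
    names = sorted(groups)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # SNPs polymorphic in every group of this combination ...
            inside = set.intersection(*(groups[g] for g in combo))
            # ... minus SNPs that are also polymorphic in any other group.
            others = [groups[g] for g in names if g not in combo]
            outside = set().union(*others)
            regions[combo] = len(inside - outside)
    return regions


regions = venn_regions(polymorphic)
for combo, count in regions.items():
    print(" & ".join(combo), count)
```

Because every SNP lands in exactly one region, the region counts sum to the number of distinct SNPs, which is the property that lets a Venn diagram be read against a stated total. The same function works unchanged for four groups, such as the four breeding programs.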
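Tree-building methods such as neighbor-joining start from a matrix of pairwise distances between genotypes. The source does not state which distance measure was used, so the sketch below assumes a simple one for illustration: the mean absolute difference in alternate-allele frequency over sites observed in both genotypes. The genotype names and frequencies are hypothetical toy values.

```python
# Hypothetical toy data: alternate-allele frequency (0.0-1.0) per genotype at
# five SNP sites; None marks a site without sequence coverage.
allele_freq = {
    "sativa_A": [0.50, 0.25, 1.00, 0.00, 0.75],
    "sativa_B": [0.50, 0.50, 1.00, 0.00, None],
    "falcata_C": [0.00, 1.00, 0.00, 1.00, 0.25],
}


def mean_abs_distance(a, b):
    """Mean absolute allele-frequency difference over sites observed in both."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no sites observed in both genotypes")
    return sum(abs(x - y) for x, y in shared) / len(shared)


# Full symmetric distance matrix keyed by genotype pair.
names = list(allele_freq)
matrix = {
    (p, q): mean_abs_distance(allele_freq[p], allele_freq[q])
    for p in names
    for q in names
}

for p in names:
    print(p, [f"{matrix[(p, q)]:.4f}" for q in names])
```

Skipping sites that lack coverage in either genotype keeps missing data from being counted as a difference; a matrix like this could then be passed to any neighbor-joining implementation.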