W Brad Barbazuk, Scott J Emrich, Hsin D Chen, Li Li, Patrick S Schnable.
Abstract
A massively parallel pyro-sequencing technology commercialized by 454 Life Sciences Corporation was used to sequence the transcriptomes of shoot apical meristems isolated from two inbred lines of maize using laser capture microdissection (LCM). A computational pipeline that uses the POLYBAYES polymorphism detection system was adapted for 454 ESTs and used to detect SNPs (single nucleotide polymorphisms) between the two inbred lines. Putative SNPs were computationally identified using 260,000 and 280,000 454 ESTs from the B73 and Mo17 inbred lines, respectively. Over 36,000 putative SNPs were detected within 9980 unique B73 genomic anchor sequences (MAGIs). Stringent post-processing reduced this number to > 7000 putative SNPs. Over 85% (94/110) of a sample of these putative SNPs were successfully validated by Sanger sequencing. Based on this validation rate, this pilot experiment conservatively identified > 4900 valid SNPs within > 2400 maize genes. These results demonstrate that 454-based transcriptome sequencing is an excellent method for the high-throughput acquisition of gene-associated SNPs.
Year: 2007 PMID: 17662031 PMCID: PMC2169515 DOI: 10.1111/j.1365-313X.2007.03193.x
Source DB: PubMed Journal: Plant J ISSN: 0960-7412 Impact factor: 6.417
Summary of multiple sequence alignments (MSAs) between MAGI 3.1 anchors and B73 and Mo17 454 ESTs
| Types of ESTs in MSAs | All ESTs | All B73 ESTs | All Mo17 ESTs | Both B73 and Mo17 ESTs | Only B73 ESTs | Only Mo17 ESTs |
|---|---|---|---|---|---|---|
| Number of MAGIs aligned | 48 063 | 33 567 | 34 928 | 20 432 | 13 135 | 14 496 |
| Bases covered | 8 897 508 | 4 989 045 | 5 798 933 | 1 890 459 | 3 098 586 | 3 908 463 |
| Coverage depth | 1.8 x | 2.3 x | 2.3 x | 8.4 x | 1.3 x | 1.3 x |
For this analysis, 454 sequences were first mapped to individual MAGIs using BLAST; each MAGI then served as the template on which the MSAs were computed using CROSS_MATCH (see Experimental procedures). Coverage data are presented for all alignments, as well as for alignments restricted to individual subsets of ESTs.
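The subset counts in the table can be cross-checked with simple arithmetic: MAGIs aligned by both lines plus those aligned by only one line should reproduce the per-line and overall totals. The following is a minimal Python sketch using the MAGI counts copied from the table; the dictionary keys and variable names are illustrative and do not come from the source.

```python
# Number of MAGIs aligned, copied from the table above.
# Key names are illustrative labels, not identifiers from the paper.
magis = {
    "all": 48_063,
    "all_b73": 33_567,
    "all_mo17": 34_928,
    "both": 20_432,
    "only_b73": 13_135,
    "only_mo17": 14_496,
}

# MAGIs hit by B73 ESTs = hit by both lines + hit by B73 only.
b73_ok = magis["both"] + magis["only_b73"] == magis["all_b73"]

# MAGIs hit by Mo17 ESTs = hit by both lines + hit by Mo17 only.
mo17_ok = magis["both"] + magis["only_mo17"] == magis["all_mo17"]

# All aligned MAGIs = both + B73 only + Mo17 only.
all_ok = (
    magis["both"] + magis["only_b73"] + magis["only_mo17"] == magis["all"]
)

print(b73_ok, mo17_ok, all_ok)  # True True True
```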
Average coverage of nucleotide sites represented within B73 454 ESTs, Mo17 454 ESTs and MAGI 3.1 anchored multiple sequence alignments
| Mo17 454 EST depth | B73 454 EST depth | Number of nucleotides | Average coverage |
|---|---|---|---|
| 1 x | ≥ 1 x | 1 092 570 | 3.2 |
| or ≥ 1 x | 1 x | | |
| 2 x | ≥ 2 x | 326 095 | 5.9 |
| or ≥ 2 x | 2 x | | |
| ≥ 3 x | 2 x | 134 386 | 6.7 |
| ≥ 3 x | ≥ 3 x | 471 794 | 22 |
Although the alignment of a single Mo17 EST to a B73-derived MAGI is sufficient to predict a SNP, increased sampling depth is expected to increase the accuracy of SNP calling by filtering out sequencing errors. Depth classes that are grouped together were pooled for analysis.
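As a rough consistency check, the pooled depth classes can be compared with the coverage summary for MAGIs aligned by both inbred lines. The sketch below assumes that three classes (minimum depth 1 x, minimum depth 2 x, and ≥ 3 x in both lines) partition the sites covered by both lines, and that the ≥ 3 x / 2 x row is a sub-category of the 2 x class; neither assumption is stated in the source. Under those assumptions, the nucleotide counts sum to the 1 890 459 bases covered by both B73 and Mo17 ESTs, and the count-weighted mean coverage comes out close to the reported 8.4 x.

```python
# (number of nucleotides, average coverage) per pooled depth class.
# Class labels are illustrative; the values are copied from the table above.
depth_classes = {
    "min depth 1 x": (1_092_570, 3.2),
    "min depth 2 x": (326_095, 5.9),
    ">= 3 x in both lines": (471_794, 22.0),
}

# Total number of sites across the three classes.
total_sites = sum(n for n, _ in depth_classes.values())

# Coverage averaged over sites, weighting each class by its site count.
weighted_mean = (
    sum(n * cov for n, cov in depth_classes.values()) / total_sites
)

print(total_sites)              # 1890459
print(round(weighted_mean, 1))  # 8.4
```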
Number of putative SNPs, depth at each SNP site by inbred line, and estimates of the total number of maize genes that contain at least one putative SNP between the B73 and Mo17 inbred lines in this SNP dataset
| Mo17 454 EST depth | B73 454 EST depth | Number of putative SNPs | Number of MAGI 3.1 anchors | Additive SNP number | Additive minimum estimate of SNP-containing genes |
|---|---|---|---|---|---|
| 1 x | 1 x | 1762 | 1154 | | |
| or 1 x | 0 | | | | |
| 2 x | ≥ 2 x | 1648 | 1039 | | |
| or ≥ 2 x | 2 x | | | | |
| ≥ 3 x | ≥ 3 x | 1452 | 900 | 1452 | 900 |
| ≥ 3 x | 2 x | 565 | 404 | 2017 | 1205 |
| ≥ 3 x | 1 x | 717 | 513 | 2734 | 1570 |
| 2 x | ≥ 3 x | 537 | 372 | 3271 | 1821 |
| 2 x | 2 x | 546 | 363 | 3817 | 2053 |
| 2 x | 1 x | 1045 | 707 | 4862 | 2548 |
| ≥ 3 x | 0 | 481 | 283 | 5353 | 2775 |
| 2 x | 0 | 1673 | 830 | 7016 | 3403 |
Polymorphic bases sampled with low redundancy (rows 1 and 2) were not further analyzed. In contrast, rows 4 and 5 illustrate polymorphic sites with a minimum sampling depth of threefold for both inbred lines, and, as a result, have the highest confidence. The remaining rows summarize alignments that predict SNPs with decreasing confidence levels. Sub-categories that are grouped together were pooled for analysis.
MAGIs are gene-enriched maize genomic sequence assemblies that are likely to contain only a single gene or gene fragment (Emrich ; Fu ).
Numbers represent a non-redundant collection at each row.
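The "Additive SNP number" column reads as a running total of the per-row putative SNP counts, which can be reproduced with a short Python sketch (variable names are illustrative). Computed this way, the seventh running total is 5343 where the table lists 5353, while the final total of 7016 agrees. The additive gene estimates are not plain sums, presumably because the same MAGI anchor can appear in more than one depth class.

```python
from itertools import accumulate

# Putative SNPs per depth class, in table order starting from the
# row with >= 3 x depth in both lines (values copied from the table).
putative_snps = [1452, 565, 717, 537, 546, 1045, 481, 1673]

# Running total of putative SNPs as lower-confidence classes are added.
running_totals = list(accumulate(putative_snps))

print(running_totals)
# [1452, 2017, 2734, 3271, 3817, 4862, 5343, 7016]
```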
Number of putative SNPs, depth at each SNP site by inbred line, and estimates of the potential number of polymorphic maize genes adjusted for validation rates
| Mo17 454 EST depth | B73 454 EST depth | Number of putative SNPs | Validation rate | Estimated number of valid SNP sites | Number of MAGI 3.1 anchors | Number of additive SNPs | Additive minimum estimate of genes impacted |
|---|---|---|---|---|---|---|---|
| ≥ 3 x | ≥ 2 x | 2017 | 0.885 | 1785 | 1154 | 1785 | 1066 |
| ≥ 2 x | 0–1 x | 3916 | | | | | |
| 1–2 x | ≥ 2 x | 1083 | 0.64 | 3199 | 1963 | 4984 | 2472 |
Validation of 110 putative B73/Mo17 SNPs divided into two groups was performed by sequencing the corresponding B73 and Mo17 alleles using Sanger technology. Using the validation rates obtained, the number of SNPs that could be validated was estimated. Because many MAGIs correspond to single genes (see Results), the number of non-redundant MAGI anchors was used to generate the estimate of the number of genes impacted. Depths that are grouped together were pooled for analysis.
Numbers are corrected for validation rate (see text).
MAGIs are gene-enriched maize genomic sequence assemblies that are likely to contain only a single gene or gene fragment (Emrich ; Fu ). These numbers represent a non-redundant collection of MAGIs (see text).
Numbers represent a non-redundant collection at each row.
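The validation-rate adjustment can be sketched in a few lines of Python. The sketch assumes that the two lower-depth rows (3916 and 1083 putative SNPs) were pooled and scaled by the single 0.64 rate, which is consistent with the tabulated 3199 but is not spelled out in the source; the function and variable names are illustrative.

```python
def estimate_valid(putative: int, rate: float) -> int:
    """Scale a putative SNP count by a validation rate (whole SNPs only)."""
    return int(putative * rate)

# Higher-confidence group: Mo17 >= 3 x and B73 >= 2 x.
high_conf = estimate_valid(2017, 0.885)

# Lower-confidence rows, pooled before applying the shared rate (assumption).
lower_conf = estimate_valid(3916 + 1083, 0.64)

# Total number of SNPs expected to validate.
total_valid = high_conf + lower_conf

# Overall validation rate from the Sanger sample reported in the abstract.
overall_rate = 94 / 110

print(high_conf, lower_conf, total_valid)  # 1785 3199 4984
print(round(overall_rate, 3))              # 0.855
```

The total of 4984 matches the "> 4900 valid SNPs" quoted in the abstract.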
Figure 1. A portion of the CROSS_MATCH-produced, template-driven, padded alignment between B73 and Mo17 454 EST sequences and the high-quality MAGI_105195 sequence assembly constructed from the B73 maize genomic survey sequence that serves as an alignment template. A G/A polymorphism occurs at position 2846 of the template (green highlight), with the Mo17 allele (A) in red and the B73 allele (G) in blue. Two insertions have occurred (yellow), one within a Mo17 454 EST and the second within a B73 454 EST. Because these insertions are not supported by other sequences, they are easily identified as errors by the POLYBAYES pipeline and are not called as polymorphisms.
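The idea that an unsupported insertion is treated as an error while a consistently supported difference is called as a polymorphism can be illustrated with a toy support-count filter. This is only a sketch and not the POLYBAYES method, which uses a Bayesian model; the pad character, the `min_support` threshold, and all function names are assumptions made for illustration.

```python
from collections import Counter

PAD = "*"  # assumed pad character marking gaps in a padded alignment


def supported_alleles(bases, min_support):
    """Return the bases seen in at least `min_support` reads, ignoring pads."""
    counts = Counter(b for b in bases if b != PAD)
    return {b for b, n in counts.items() if n >= min_support}


def classify_column(b73_bases, mo17_bases, min_support=2):
    """Classify one alignment column from the reads of each inbred line."""
    b73 = supported_alleles(b73_bases, min_support)
    mo17 = supported_alleles(mo17_bases, min_support)
    if not b73 or not mo17:
        # No base is backed by enough reads in at least one line,
        # e.g. an insertion present in a single read.
        return "unsupported"
    if len(b73) == 1 and len(mo17) == 1:
        return "candidate SNP" if b73 != mo17 else "monomorphic"
    return "ambiguous"


# A G/A difference supported by three reads per line.
print(classify_column("GGG", "AAA"))
# An inserted T carried by one Mo17 read only.
print(classify_column("***", "*T*"))
```

With these inputs the first column is reported as a candidate SNP and the singleton insertion as unsupported, mirroring the behaviour described in the figure legend.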