Paul D Blischak, Aaron J Wenzel, Andrea D Wolfe.
Abstract
PREMISE OF THE STUDY: Penstemon (Plantaginaceae) is a large and diverse genus endemic to North America. However, determining the phylogenetic relationships among its 280 species has been difficult due to its recent evolutionary radiation. The development of a large, multilocus data set can help to resolve this challenge.
Keywords: 454 pyrosequencing; BLAST; MAKER2; Penstemon; bioinformatics
Year: 2014 PMID: 25506519 PMCID: PMC4259454 DOI: 10.3732/apps.1400044
Source DB: PubMed Journal: Appl Plant Sci ISSN: 2168-0450 Impact factor: 1.936
Fig. 1. Workflow used for marker development from six low-coverage Penstemon genomes using the MAKER2 Annotation Pipeline.
MAKER2 dependencies, version numbers, and websites.
| Program | Version | Website |
| MAKER2 | update 07-22-2012 | |
| RepeatMasker | open-3.3.0 | |
| RMBLAST | 2.2.27 | |
| RepBase | update 20120418 | |
| RM database | update 20120418 | |
| SNAP | N/A; downloaded 15 Dec. 2012 | |
| Legacy BLAST | 2.2.26 | ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST/ |
| BLAST+ | 2.2.27 | ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST |
| Exonerate | 2.2.0 | |
| Perl | 5.10 | |
| BioPerl | 1.6.1 | |
| Augustus | 2.5 | |
URLs for websites as of 14 November 2014.
Required for installation but not used in our analyses.
Low-coverage WGS sequencing and assembly statistics for Penstemon centranthifolius and P. grinnellii.
| Statistics | P. centranthifolius | P. grinnellii |
| Sequencing statistics | | |
| Total number of reads | 301,622 | 218,457 |
| Total base pairs sequenced | 129,272,235 | 92,989,517 |
| Assembly statistics | | |
| % reads assembled | 33.04% | 36.02% |
| Total assembled base pairs | 4,719,701 | 3,874,098 |
| No. of contigs | 8436 | 6927 |
| No. of contigs (>500 bp) | 3200 | 2677 |
| Contig N50 (>500 bp) | 984 | 1012 |
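The contig N50 reported above is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembled length. The following is a minimal sketch of that calculation restricted to contigs longer than 500 bp. The function name `n50`, the `min_length` parameter, and the toy contig lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths, min_length=500):
    """Return the N50 of the contigs strictly longer than min_length (0 if none)."""
    # Keep only contigs above the length cutoff, longest first.
    kept = sorted((n for n in lengths if n > min_length), reverse=True)
    half_total = sum(kept) / 2
    running = 0
    for n in kept:
        running += n
        # The first contig that brings the running total to half is the N50.
        if running >= half_total:
            return n
    return 0


# Hypothetical contig lengths in bp (not data from the paper).
toy_lengths = [300, 600, 800, 1000, 1200, 2000]
print(n50(toy_lengths))
```

With these toy lengths the 300 bp contig is excluded by the cutoff, and the remaining five contigs are used for the calculation.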
Comparison of the amount of output for sequences identified to contain gene regions using MAKER2, SNAP, BLASTN, and BLASTX. Results for unrestricted BLASTN/BLASTX searches are given along with the searches that reported only the best hit.
| MAKER2 annotations | SNAP gene predictions | BLASTN unrestricted/best hit | BLASTX unrestricted/best hit | |
| 486 | 2437 | 823/77 | 150,172/1839 | |
| 365 | 1898 | 805/79 | 88,619/1378 | |
| 238 | 784 | 659/164 | 107,867/1511 | |
| 238 | 827 | 695/169 | 101,621/1561 | |
| 230 | 801 | 779/197 | 103,041/1563 | |
| 338 | 1689 | 858/228 | 173,980/2740 |
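The gap between the unrestricted and best-hit counts above comes from keeping one hit per query rather than every reported alignment. The sketch below shows one way such a filter could look, assuming BLAST+ tabular output (`-outfmt 6`, where the first column is the query ID and the twelfth is the bit score) and assuming "best" means the highest bit score per query. The source does not state how the best hit was selected, and the function name and toy records are hypothetical.

```python
import csv
import io

BITSCORE_COL = 11  # twelfth column of BLAST+ tabular output (-outfmt 6)


def best_hits(handle):
    """Map each query ID to its highest-bit-score row from tabular BLAST output."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query = row[0]
        score = float(row[BITSCORE_COL])
        if query not in best or score > float(best[query][BITSCORE_COL]):
            best[query] = row
    return best


# Hypothetical tabular records (not data from the paper).
toy = io.StringIO(
    "contig1\testA\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-50\t300\n"
    "contig1\testB\t90.0\t150\t15\t0\t1\t150\t1\t150\t1e-20\t150\n"
    "contig2\testC\t88.0\t100\t12\t0\t1\t100\t1\t100\t1e-10\t90\n"
    "contig1\testD\t97.0\t210\t6\t0\t1\t210\t1\t210\t1e-55\t310\n"
)
hits = best_hits(toy)
for query, row in sorted(hits.items()):
    print(query, row[1], row[BITSCORE_COL])
```

Here four unrestricted hits collapse to two best hits, one per query contig.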
Fig. 2. Comparing MAKER2 annotations to best-hit BLAST searches against ESTs (BLASTN) and protein sequences (BLASTX) for the six species of Penstemon sequenced. Mean sequence lengths are plotted as dashed, vertical lines. Means ± SEs are also given in the upper right corner of each graph.
Fig. 3. Plot of pairwise comparisons of sequence variation among the six low-coverage genomes using BLASTN (Cent = P. centranthifolius, Cyan = P. cyananthus, Davs = P. davidsonii, Diss = P. dissectus, Frut = P. fruticosus, Grin = P. grinnellii). Rows represent the species used as the database, and columns represent the species used as the query (e.g., row Cent, column Grin represents a BLASTN search with P. grinnellii as the query and P. centranthifolius as the database). Mean sequence variation ± SEs and sample size are shown in the upper right corner of each graph. Note that the matrix is not symmetric because swapping which species serves as the query and which as the database changes the BLAST results (e.g., Frut vs. Cyan ≠ Cyan vs. Frut).
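The per-panel summaries described in this caption (mean ± SE and sample size for each database/query pair) can be sketched as follows. The caption does not define how sequence variation was measured, so this example assumes it is 100 minus the percent identity of each BLASTN hit. The percent identities, the `summarize` helper, and the dictionary layout are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values):
    """Return (mean, standard error, sample size) for a list of numbers."""
    n = len(values)
    se = stdev(values) / sqrt(n) if n > 1 else 0.0
    return mean(values), se, n


# Hypothetical percent identities keyed by (database, query) species codes.
percent_identity = {
    ("Frut", "Cyan"): [98.0, 96.0, 97.0],
    ("Cyan", "Frut"): [99.0, 96.0],
}

summary = {}
for pair, identities in percent_identity.items():
    # Assumed definition: variation = 100 - percent identity.
    variation = [100.0 - p for p in identities]
    summary[pair] = summarize(variation)

for (database, query), (avg, se, n) in summary.items():
    print(f"db={database} query={query}: {avg:.2f} ± {se:.2f} (n={n})")
```

Because the two directions of a comparison can return different sets of hits, the two entries for a species pair generally differ in both sample size and mean, which is the asymmetry the caption notes.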