Christine E Edwards, Thomas L Parchman, Carl W Weekley.
Abstract
Large-scale DNA sequence data may enable development of genetic resources in endangered species, thereby facilitating conservation efforts. Ziziphus celata, a federally endangered, self-incompatible plant species occurring in Florida, USA, is one species for which genetic resources are necessary to facilitate new introductions and augmentations essential for recovery of the species. We used 454 pyrosequencing of a Z. celata normalized floral cDNA library to create a genomic resource for gene and marker discovery. A half-plate GS-FLX Titanium run yielded 655 337 reads averaging 250 bp. A total of 474 025 reads were assembled de novo into 84 645 contigs averaging 408 bp, while 181 312 reads remained unassembled. Forty-seven percent and 43% of contig consensus sequences had BLAST matches to known proteins in the Uniref50 and TAIR9 annotated protein databases, respectively; many contigs fully represented orthologous proteins in TAIR9. A total of 22 707 unique genes were sequenced, indicating substantial coverage of the Z. celata transcriptome. We detected single-nucleotide polymorphisms and simple sequence repeats (SSRs) and developed thousands of SSR primers for use in future genetic studies. As a first step towards understanding self-incompatibility in Z. celata, we identified sequences belonging to the gene family encoding self-incompatibility. This study demonstrates the efficacy of 454 transcriptome sequencing for rapid gene and marker discovery in an endangered plant.
Year: 2011 PMID: 22039173 PMCID: PMC3276261 DOI: 10.1093/dnares/dsr037
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
The number and percentage of the total Ziziphus celata 454 contigs and singletons that matched to annotated protein databases
| Database | Contigs: total hits | Contigs: unique hits | Singletons: total hits | Singletons: unique hits | Combined set: total hits | Combined set: unique hits |
|---|---|---|---|---|---|---|
| Uniref | 39 815 (47%) | 17 420 | 46 124 (25%) | 16 332 | 85 939 (32%) | 22 707 |
| TAIR | 36 801 (43%) | 13 514 | 42 707 (23%) | 12 479 | 79 508 (30%) | 16 089 |
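
The percentages in the table can be rechecked from the counts reported in the abstract (84 645 contigs, 181 312 unassembled reads). The short Python sketch below does that arithmetic; it assumes that the singletons in the table are the unassembled reads, and the names `N_CONTIGS`, `N_SINGLETONS`, `HITS` and `hit_rates` are illustrative only and do not come from the study.

```python
# Sequence counts from the abstract; BLAST hit counts from the table above.
# Assumption: "singletons" are the 181 312 reads left unassembled.
N_CONTIGS = 84_645
N_SINGLETONS = 181_312

HITS = {
    "Uniref": {"contigs": 39_815, "singletons": 46_124},
    "TAIR": {"contigs": 36_801, "singletons": 42_707},
}


def hit_rates(database):
    """Return the percentage of sequences with a BLAST hit, per sequence set."""
    contig_hits = HITS[database]["contigs"]
    singleton_hits = HITS[database]["singletons"]
    return {
        "contigs": 100 * contig_hits / N_CONTIGS,
        "singletons": 100 * singleton_hits / N_SINGLETONS,
        "combined": 100 * (contig_hits + singleton_hits)
        / (N_CONTIGS + N_SINGLETONS),
    }


if __name__ == "__main__":
    for name in HITS:
        rates = hit_rates(name)
        print(name, {key: round(value, 1) for key, value in rates.items()})
```

The combined-set denominator is simply the sum of contigs and singletons, which is why the combined hit rate falls between the contig and singleton rates.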
Figure 1. Functional gene diversity in the Z. celata transcriptome data. Bars represent the number of assignments of Z. celata proteins with BLAST matches in the TAIR9 database to each GO term.
Figure 2. The ratio of Z. celata contig length to A. thaliana orthologue length as a function of contig coverage depth. The dotted line corresponds to a ratio of 1, where 454 contigs are as long or longer than the BLAST matched A. thaliana orthologues.
Number of unique BLAST matches to annotated proteins in different taxonomic groups
| Taxonomic group | Unique BLAST hits |
|---|---|
| Plants | 19 790 |
| Algae | 100 |
| Other eukaryote | 496 |
| Fungi | 138 |
| Bacteria | 131 |
| Virus | 553 |
| Other | 1499 |
Number of each type of SSRs detected in the Z. celata transcriptome. Values in parentheses indicate the number of sequences for which PCR primers were successfully designed.
| Repeat motif | Contigs | Singletons | Total |
|---|---|---|---|
| Di | 3493 (2145) | 6620 (2075) | 10 113 (4220) |
| Tri | 4195 (3836) | 11 638 (3188) | 15 833 (7024) |
| Tetra | 3266 (2313) | 15 329 (2379) | 18 595 (4692) |
| Total | 10 954 (8294) | 33 587 (7643) | 44 541 (15 936) |
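
As an illustration of how di-, tri- and tetranucleotide SSRs such as those counted above can be located in a sequence, here is a minimal regular-expression sketch in Python. It is not the tool or the settings used in the study: the minimum repeat counts in `MIN_REPEATS` are hypothetical thresholds chosen for the example, and `find_ssrs` is an illustrative helper name.

```python
import re

# Hypothetical thresholds (NOT from the study): minimum number of tandem
# copies required for each motif length.
MIN_REPEATS = {2: 6, 3: 4, 4: 4}
MOTIF_CLASS = {2: "Di", 3: "Tri", 4: "Tetra"}


def find_ssrs(sequence):
    """Return (class, motif, start, copies) tuples for perfect tandem repeats."""
    sequence = sequence.upper()
    found = []
    for k, min_copies in MIN_REPEATS.items():
        # One k-mer followed by at least (min_copies - 1) further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are a shorter unit repeated, e.g. "AAA" or "ATAT".
            if any(
                motif == motif[:d] * (k // d)
                for d in range(1, k)
                if k % d == 0
            ):
                continue
            copies = len(match.group(0)) // k
            found.append((MOTIF_CLASS[k], motif, match.start(), copies))
    return found


if __name__ == "__main__":
    print(find_ssrs("GG" + "AT" * 7 + "CC"))
```

Only perfect repeats are reported here; interrupted or compound repeats, and the primer-design step reflected by the values in parentheses, are outside the scope of this sketch.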